PREFACE


This text has been written on the premise that, for students of the
behavioral sciences, the first course in statistics should be definitely within the
subject-matter field. Accordingly, it emphasizes quantitative thinking in
psychology and education, areas from which materials for examples and
exercises have been drawn.

An attempt has been made to reduce "symbol shock," often a difficulty
for the beginner in psychological and educational statistics. One of the
advantages of starting with categorical data is that the introduction of
unfamiliar symbols and processes can be gradual. However, by the time
the book has been studied throughout, the student's acquaintance with
statistical symbols and concepts should be sufficient for reading the major
portion of contemporary psychological research.

The order selected for the presentation of certain major topics owes
much to the thinking of S. S. Stevens, as expressed in “Mathematics,
Measurement, and Psychophysics,” in the Handbook of Experimental
Psychology. After an overall view of statistics in experimental and professional
psychology, there are two chapters involving description by counting or
enumeration, then a chapter on methods based on ranking, followed by
six chapters involving summing and averaging of continuous measures.
These six chapters include a systematic presentation of linear correlation.


The two chapters on distribution functions are mathematical, but are
presented at a simple and somewhat intuitive level. Much of the difficult 
mathematics in advanced statistics has to do with probability curves. 
Consequently, discussion centers on the logic of the functions, together 
with some of their applications. The separation of the chapters on 
inference and on simple analysis of variance from the chapters on prob- 
ability is merely for clarity of presentation. 

In a one-semester course, the instructor may find it convenient to omit 
one or more of the final topics: test construction, matrix algebra, factor 
analysis, and distribution-free statistics. Just as the chapter on the analysis 
of variance can be considered as preliminary orientation to an advanced 
course in experimental design, so these final chapters aim to give an intro- 
duction to advanced work in psychometrics, factor theory, and other 
specialized topics. 

While no familiarity with mathematics beyond elementary algebra is 
assumed, an attempt has been made to emphasize the mathematical 
reasonableness of common descriptive statistics, both as a basis for the 
prediction of behavior in individual instances and as a means of
generalization in more broadly oriented studies. Although the emphasis is on the
logic of statistical procedures, attention has been given to the presentation 
of efficient methods of computation, both by hand and by desk calculator, 
with some reference to electronic computers. 

In an era in which most psychologists are applied psychologists working 
with patients in clinics and hospitals, students in educational institutions, 
clients in counseling organizations, employees in business and industry, 
and officers and men in the military services, an emphasis in the elementary 
course on the statistics applicable to individual differences seems to be 
appropriate. However, it is believed that statistical methods useful in 
solving psychological problems of general scientific interest have not been
neglected. 

Considerable thought has been given to the matter of statistical symbols. 
A survey of the symbols used in statistical texts written by psychologists 
and educators showed wide variation in usage, with authors frequently 
introducing notation of their own. For the concepts for which notation is
relatively uniform, established usage has been followed. In other cases
what seems to be the best current practice has been followed. In special
instances minor innovations have been introduced.

I am grateful to authors and publishers for permission to re[...]
for which due acknowledgement is made in each [...]. I am indebted to the
late Sir Ronald A. Fisher, to Dr. Frank Yates, Rothamsted, and to Messrs.
Oliver and Boyd Ltd. for permission to reprint Tables III, IV, V, and VI
from their book Statistical Tables for Biological, Agricultural
and Medical Research. I am also indebted to Professor E. S. Pearson and
the Biometrika Trustees for permission to reproduce portions of W. F. 
Sheppard's New Tables of the Probability Integral, which appeared in 
Volume 2 of Biometrika. 

Table Z was prepared at the Washington University Computing 
Facilities, supported in part by National Science Foundation Grant 
No. G-22296. 

For helpful suggestions in the preparation of this text, I am grateful 
both to Dr. Gardner Murphy, long-time psychology editor for Harper & 
Row, and to the incoming editor, Dr. Wayne H. Holtzman. For detailed 
and exceedingly useful comments I am particularly indebted to Dr. Clarke 
W. Crannell of Miami University, Oxford, Ohio, Dr. Walter L. Deemer 
of the U.S. Air Force, Washington, D.C., and Dr. Robert W. Heath of 
Educational Testing Service, Berkeley, California. Others whose comments 
are much appreciated include Miss Lolafaye Coyne and Dr. Riley K. 
Gardner of the Menninger Foundation, Topeka; Dr. Robert I. Watson 
of Northwestern University, Evanston; Dr. G. Douglas Mayo, LTJG 
A. A. Longo, and Mr. David S. Thomas of CNATECHTRA, Naval Air 
Station, Memphis; Dr. Marilyn K. Rigby of St. Louis University; Dr. 
Winton H. Manning of Texas Christian University, Fort Worth; Dr. 
David K. Trites of the Civil Aeromedical Research Institute, Oklahoma 
City; Dr. Kenneth S. Teel of the Autonetics Division, North American 
Aviation Corporation; Dr. E. Muriel J. Wright of San Fernando Valley 
State College, Northridge, California; Dr. Daniel S. Lordahl of the 
University of Miami, Coral Gables; and Mr. Edward V. Hackett of Memphis
State University, Memphis. Colleagues at Washington University
have been very helpful with their suggestions, particularly Dr. James M. 
Vanderplas, Dr. Richard H. Willis, Dr. Norman L. Corah and Mr. King 
M. Wientge. Numerous useful comments have been made by students 
who have used the material in class, especially Mrs. Virginia Proctor, Miss
Charlan Nemeth, and Mr. J. Philip Miller. To all of these individuals I 
express my sincere appreciation for their helpfulness. Responsibility for 
errors and ambiguities remaining in the text belongs, of course, to the
author. For typing the entire manuscript and assistance with many of the 
details of its preparation I am grateful to Miss Madeline Coran. 


Washington University PHILIP H. DUBOIS 


St. Louis 


AN INTRODUCTION TO PSYCHOLOGICAL STATISTICS 


STATISTICS 

IN EXPERIMENTAL 
AND APPLIED 
PSYCHOLOGY 


1 


AIMS OF PSYCHOLOGICAL STATISTICS 


Science is built upon planned, systematic observations. Collected with 
reference to definite hypotheses, observations are quantified and used in 
the development of principles and laws. Information obtained in some 
scientific investigations has a high degree of precision so that relationships 
can be stated more or less exactly. Psychology, however, requires the use
of statistical methods, developed to deal with data that involve
considerable unexplained variation, but which are often capable of yielding
important generalizations. With statistical methods, imperfect
relationships can be described, and the dependability of a set of observations can
be estimated.

In psychology, statistical methods have six important objectives:


1. The refinement of measures¹ used to describe in numerical terms defined
aspects of the behavior of individuals;

2. The description of characteristics of individuals and of groups in terms
of these measures;

3. The description of relationships among these variables;

4. The generalization of findings within specific samples of individuals to
wider populations;

5. The prediction of the behavior of individuals under specified conditions;
and

6. The estimation of the consistency or reliability of information.

¹ Since values obtained from these measures vary from person to person, they
are called variables or variates.


Both generalization and prediction are based upon descriptive statistics. 
After a specific sample (representative of cases not yet studied) has been 
precisely described, it often becomes possible to formulate widely applicable 
principles and to forecast aspects of the behavior of individuals not yet
observed. The term descriptive statistics refers to procedures for simpli- 
fying quantitative information so that the structure or form of the data 
becomes easier to perceive. Methods are either graphical or numerical. 


DESCRIPTIVE STATISTICS: GRAPHICAL METHODS 


A useful graphical technique is the preparation of a “pie chart” to show
numbers or relative proportions in several categories. (Example 1.1.) 


EXAMPLE 1.1 


PREPARATION OF A PIE CHART 


FIG. 1.1. PREFERENCES OF 70 INDUSTRIAL TRAINEES FOR PROGRAMMED
INSTRUCTION
[Pie chart: one sector for each response category, labeled with its percentage.]


Hughes and McNamara (5) used programmed instruction for 70 students in an
[...] course in data processing. At the termination of the course, the
students were asked: “In future company courses you may take, would you like
to see the programmed instruction method used in place of the regular
classroom method?” Responses to the question are shown in the table.


                                             DEGREES
RESPONSE            f     PERCENTAGE     (CIRCLE = 360°)
Strongly prefer    32         46              165
Some preference    26         37              134
Don't care          3          4               15
Some objection      9         13               46
Strongly object     0          0                0
TOTAL              70        100              360


Entries in the percentage column were found by multiplying each frequency (f)
by 100 and dividing by the total number of cases (N). Entries in the column
headed “Degrees” were found by multiplying f by 360 and dividing by N. In
constructing the pie chart, shown as Fig. 1.1, each sector is bounded by radii of
the circle, separated by the number of degrees appropriate to the frequency and
corresponding percentage.
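
The arithmetic just described can be sketched in a few lines of Python. This is only an illustration of the two formulas (f × 100 / N and f × 360 / N); the names `responses` and `pie_chart_entries` are chosen here for convenience and do not come from the text.

```python
# Frequencies (f) for each response, taken from the table in Example 1.1.
responses = {
    "Strongly prefer": 32,
    "Some preference": 26,
    "Don't care": 3,
    "Some objection": 9,
    "Strongly object": 0,
}


def pie_chart_entries(frequencies):
    """Return {category: (percentage, degrees)}, rounded to whole numbers."""
    n = sum(frequencies.values())  # total number of cases (N)
    return {
        name: (round(f * 100 / n), round(f * 360 / n))
        for name, f in frequencies.items()
    }


entries = pie_chart_entries(responses)
for name, (percent, degrees) in entries.items():
    print(f"{name:<16} {percent:>3}%  {degrees:>3} degrees")
```

Rounded entries need not always add up to exactly 100 or 360; with these frequencies they happen to do so.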


With continuous data such as those representing scores on a psycho- 
logical or educational test, a histogram or frequency polygon may be 
constructed to show characteristics of the distribution. Such characteristics 
may include the location of the central point in terms of a corresponding 
score; how much the scores vary from the central point; and whether the
scores are symmetrically distributed about the central point or whether
the distribution is lopsided or skewed. (Examples 1.2 and 1.3.) 


EXAMPLE 1.2
PREPARATION OF A HISTOGRAM 


Purpose. A histogram or column diagram is a convenient format to show the
shape of a distribution.

Method. Generally, the vertical axis (the y axis, or ordinate) shows the
frequencies, and the horizontal axis (the x axis, or abscissa) shows the values, with
the higher values toward the right.

In each step or category a horizontal line is drawn at the vertical point
representing the frequency. These lines are then connected with vertical lines, which
sometimes are extended down to the x axis.

If the area of the entire surface is taken as 1.000, the area of each column is
proportional to the frequency within the step.

FIG. 1.2A. HISTOGRAM SHOWING SCORES OF 1016 HIGH SCHOOL SENIORS ON A
READING COMPREHENSION TEST
[Histogram: x axis, steps 0-2 through 33-35 (step limits 2.5 through 35.5); y axis, number of cases, 0 to 240.]

Data Represented. A symmetrical distribution of scores of 1016 high school
seniors on a reading test is given below and is depicted graphically as Fig. 1.2A.
A step interval of 3 is used; that is, all scores of 3, 4, or 5 are tabulated on the
lowest step; all scores of 6, 7, or 8 on the second step; all scores of 9, 10, and 11
on the next step; and so on. The frequency and the proportion in each category
are given in the accompanying table.


STEP      FREQUENCY    PROPORTION
33-35          3          .003
30-32         14          .014
27-29         38          .037
24-26        129          .127
21-23        191          .188
18-20        233          .229
15-17        215          .212
12-14        128          .126
9-11          52          .051
6-8           10          .010
3-5            3          .003
          N = 1016
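
As a check on the table, the proportions can be recomputed from the frequencies. The sketch below simply divides each frequency by N; the variable names are illustrative and not part of the text.

```python
# Step frequencies from the table in Example 1.2, highest step first.
frequencies = {
    "33-35": 3, "30-32": 14, "27-29": 38, "24-26": 129,
    "21-23": 191, "18-20": 233, "15-17": 215, "12-14": 128,
    "9-11": 52, "6-8": 10, "3-5": 3,
}

n = sum(frequencies.values())  # total number of cases (N)

# The proportion in a step is its frequency divided by N. With the whole
# area of the histogram taken as 1.000, this is also the column's area.
proportions = {step: f / n for step, f in frequencies.items()}

for step, p in proportions.items():
    print(f"{step:>5}  {frequencies[step]:>4}  {p:.3f}")
print(f"N = {n}")
```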


A Second Histogram. Data [...]
of Air Force pilots having [...]
period is highly asymmetrical, or skewed, since there were far fewer accidents
[...] vast majority had no accidents at all. (Most of these accidents
involved damage only to aircraft or to other property.)
FIG. 1.2B. DISTRIBUTION IN HISTOGRAM FORM OF ACCIDENTS OF 17,952 U.S.
AIR FORCE PILOTS DURING AN 8-YEAR PERIOD
[Histogram: x axis, number of accidents; y axis, number of pilots.]


On the histogram, numbers of pilots are represented on the y axis and number
of accidents on the x axis. In addition, the precise number of pilots within each
accident category is indicated in the column.


EXAMPLE 1.3 


PREPARATION OF A FREQUENCY POLYGON 


Purpose. Like the histogram, the frequency polygon in Fig. 1.3 shows the shape 
of a distribution. The data here are the scores of 1016 high school seniors on the 
reading test used also in Example 1.2. 

Method. Instead of a line at the top of each column to represent the frequency 
in each step, a point is placed at the midpoint of the step. The vertical position of 
the point represents the frequency. The several points are then connected directly 
with straight lines to form the frequency polygon. The last line of the polygon on 
either side ends at the midpoint of the zero-frequency class adjacent. 

Both end points of the frequency polygon are thus on the base line, and the 
area under the polygon is equal to the area under the corresponding histogram. 
As with the histogram, area is used as a representation of frequency. 

FIG. 1.3. FREQUENCY POLYGON SHOWING SCORES OF 1,016 HIGH SCHOOL
SENIORS ON A READING COMPREHENSION TEST
(Data the same as in Figure 1.2A.)
[Frequency polygon: x axis, scores in steps 0-2 through 36-38; y axis, frequency.]

Smoothing the Polygon. Sometimes, to visualize the distribution as it would
be if the effect of sampling errors were reduced, the curve is “smoothed.” One
method involves plotting, not the obtained frequency in each step, but the
average (called the “moving average”) of three frequencies, those of the preceding
step, the step itself, and the following step.
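
The moving average of three can be sketched as follows, using the frequencies of Example 1.2. One detail is an assumption made here rather than something stated in the text: the steps just beyond either end of the distribution are treated as having a frequency of zero, in keeping with the zero-frequency classes on which the polygon ends.

```python
def moving_average_of_three(frequencies):
    """Average each frequency with its two neighbors.

    Assumption for this sketch: the steps just outside the
    distribution are counted as zero-frequency steps.
    """
    padded = [0] + list(frequencies) + [0]
    return [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3
        for i in range(1, len(padded) - 1)
    ]


# Frequencies from Example 1.2, lowest step (3-5) to highest (33-35).
freqs = [3, 10, 52, 128, 215, 233, 191, 129, 38, 14, 3]
smoothed = moving_average_of_three(freqs)

for f, s in zip(freqs, smoothed):
    print(f"{f:>4}  {s:7.1f}")
```

For the step 18-20, for instance, the plotted value becomes (215 + 233 + 191) / 3 = 213 in place of the obtained 233.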

Charts involving two sets of measurements for the same group of
individuals enable one to see whether the association between the variables
is marked or slight. When the association is definite, a chart can indicate
whether the relationship is better expressed mathematically as a straight
line or as some sort of a curve. (Examples 1.4, 1.5, 1.6.)


EXAMPLE 1.4 


TWO VARIABLES NO RELATIONSHIP 


1st Class (“Cours Supérieur”)       4.9 words
2nd Class                           4.8 words
3rd Class                           4.9 words
4th Class (“Cours élémentaire”)     4.6 words

FIG. 1.4. AVERAGE NUMBER OF WORDS REPRODUCED PRECISELY AFTER A SINGLE
REPETITION OF A SERIES OF SEVEN WORDS
Source of Data. In a study published 10 years before his first intelligence scale,
Binet (1) was apparently surprised to find no relationship between class standing 
and memory span for digits. Subjects were 32 children between the ages of 7 and 
12 years in each of 4 classes. 

Method. His negative findings are presented in graphic form in Fig. 1.4. The 
length of each bar is proportional to the obtained average. 


EXAMPLE 1.5


TWO VARIABLES WITH MARKED RELATIONSHIP 


Source of Data. In a pioneer study in applied psychology, Thurstone (12) 
found a definite relationship between scores on a rhythm test and later success 
in telegraphy, measured in terms of receiving speed. 


TABLE 1.1. TWO-WAY FREQUENCY DIAGRAM

RECEIVING                        ERRORS ON RHYTHM TEST
SPEED, WORDS
PER MINUTE     32-35  28-31  24-27  20-23  16-19  12-15  8-11   4-7   0-3    TOTAL
12 or more       0      0      1      1      1      1      1     2     7       14
10-11            1      1      1      1      0      7      0     0     2       13
8-9              0      0      1      0      2      3      1     1     1        9
6-7              0      1      3      0      4      3      1     4     0       16
4-5              1      1      2      3      3      2      2     0     0       14
2-3              0      1      0      4      3      0      2     0     0       10
0-1              1      3      0      1      2      0      0     0     0        7

TOTAL            3      7      8     10     15     16      7     7    10    (N = 83)


Two-Way Frequency Diagram. Findings are presented in Table 1.1. On the 
vertical axis, or ordinate, desirable values (fast receiving speed) are toward the 
top of the distribution and undesirable values are toward the bottom. On the 
horizontal axis, or abscissa, desirable scores (freedom from errors) are toward 
the right. 

It will be noted that there are relatively few cases in the upper left-hand corner 
of the diagram or in the lower right-hand corner. Instead, there is definite con- 
centration of cases along the line that might be drawn from the lower left-hand 
corner to the upper right-hand corner. This indicates a tendency for low receiving
speed to be associated with errors on the rhythm test, and vice versa.

Dichotomized Diagrams. Three charts representing the same data are also pre- 
sented as Figs. 1.5A, 1.5B, and 1.5C. 

In Fig. 1.5A, the percentage with 15 or fewer errors on the rhythm test has 
been plotted for the seven groups according to receiving speed. It will be noted 
that the higher the receiving speed, the greater the percentage having 15 or fewer 
errors. The particular dividing point is arbitrary. However, it divides the total 
group of 83 into two approximately equal subgroups, which means that the 
average percentage is not far from 50 percent. 


RECEIVING                   f WITH
SPEED, WORDS              15 ERRORS
PER MINUTE        f        OR LESS      PERCENT
12 or more       14          11           79%
10-11            13           9           69%
8-9               9           6           67%
6-7              16           8           50%
4-5              14           4           29%
2-3              10           2           20%
0-1               7           0            0%

FIG. 1.5A. PERCENTAGE AT EACH RECEIVING SPEED WITH 15 ERRORS OR LESS ON
PREDICTIVE TEST


For each category of receiving speed, the frequency of cases with 15 errors or
less is found by combining appropriate cells in Table 1.1.
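
The combining of cells can be sketched as follows. The data are the cell frequencies of Table 1.1; the names `ERROR_BANDS`, `TABLE_1_1`, and `percent_with_few_errors` are illustrative only.

```python
# Column labels of Table 1.1 (errors on rhythm test), left to right.
ERROR_BANDS = ["32-35", "28-31", "24-27", "20-23", "16-19",
               "12-15", "8-11", "4-7", "0-3"]

# Rows of Table 1.1, keyed by receiving speed in words per minute.
TABLE_1_1 = {
    "12 or more": [0, 0, 1, 1, 1, 1, 1, 2, 7],
    "10-11":      [1, 1, 1, 1, 0, 7, 0, 0, 2],
    "8-9":        [0, 0, 1, 0, 2, 3, 1, 1, 1],
    "6-7":        [0, 1, 3, 0, 4, 3, 1, 4, 0],
    "4-5":        [1, 1, 2, 3, 3, 2, 2, 0, 0],
    "2-3":        [0, 1, 0, 4, 3, 0, 2, 0, 0],
    "0-1":        [1, 3, 0, 1, 2, 0, 0, 0, 0],
}

# The columns that together make up "15 errors or less".
LOW_ERROR_BANDS = {"12-15", "8-11", "4-7", "0-3"}


def percent_with_few_errors(row):
    """Return (f with 15 errors or less, total f, percentage) for one row."""
    total = sum(row)
    few = sum(f for band, f in zip(ERROR_BANDS, row) if band in LOW_ERROR_BANDS)
    return few, total, round(100 * few / total)


summary = {speed: percent_with_few_errors(row) for speed, row in TABLE_1_1.items()}
for speed, (few, total, percent) in summary.items():
    print(f"{speed:<11} {total:>3} {few:>3} {percent:>3}%")
```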

In Fig. 1.5B the same basic information is presented as an “Expectancy Chart.”
The group has been divided into three categories according to standing on the
rhythm test, and differential expectancy of attaining 8 words per minute receiving
speed has been plotted.


                          f WITH
ERRORS ON               8 OR MORE
RHYTHM                  WORDS PER
TEST            f         MINUTE       PERCENT
0-11           24           15           63%
12-23          41          [...]         [...]
24-35          18          [...]         [...]

FIG. 1.5B. PERCENT ATTAINING RECEIVING SPEED OF 8 WORDS PER MINUTE OR
BETTER FOR THREE GROUPS ON RHYTHM TEST


From Table 1.1 it can be seen that of the 24 individuals who had fewer than
12 errors on the rhythm test, 15 had receiving speeds of 8 words per minute or
better (15/24 = .625). Accordingly, [...] percentages are determined similarly.

When the standard is changed to 6 words per minute, as in Fig. 1.5C, the
general picture is practically the same.
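
The expectancy percentages for either standard can be recomputed from Table 1.1, as in the sketch below. The function and variable names are illustrative. Percentages are rounded half up (so that .625 becomes 63 percent, as in the text); that rounding rule is an assumption of this sketch.

```python
import math

# Column bounds of Table 1.1 (errors on rhythm test), left to right.
ERROR_BANDS = [(32, 35), (28, 31), (24, 27), (20, 23), (16, 19),
               (12, 15), (8, 11), (4, 7), (0, 3)]

# Rows of Table 1.1: (lowest receiving speed in the row, cell frequencies).
ROWS = [
    (12, [0, 0, 1, 1, 1, 1, 1, 2, 7]),
    (10, [1, 1, 1, 1, 0, 7, 0, 0, 2]),
    (8,  [0, 0, 1, 0, 2, 3, 1, 1, 1]),
    (6,  [0, 1, 3, 0, 4, 3, 1, 4, 0]),
    (4,  [1, 1, 2, 3, 3, 2, 2, 0, 0]),
    (2,  [0, 1, 0, 4, 3, 0, 2, 0, 0]),
    (0,  [1, 3, 0, 1, 2, 0, 0, 0, 0]),
]

# The three groups on the rhythm test used in Figs. 1.5B and 1.5C.
GROUPS = {"0-11": (0, 11), "12-23": (12, 23), "24-35": (24, 35)}


def round_half_up(x):
    """Round to the nearest whole number, with halves going upward."""
    return math.floor(x + 0.5)


def expectancy(min_speed):
    """Return {group: (f, f attaining min_speed or better, percentage)}."""
    result = {}
    for label, (low, high) in GROUPS.items():
        cols = [i for i, (b_low, b_high) in enumerate(ERROR_BANDS)
                if low <= b_low and b_high <= high]
        total = sum(row[i] for _, row in ROWS for i in cols)
        attained = sum(row[i] for speed, row in ROWS
                       if speed >= min_speed for i in cols)
        result[label] = (total, attained, round_half_up(100 * attained / total))
    return result


for standard in (8, 6):
    print(f"Standard: {standard} words per minute or better")
    for label, (total, attained, percent) in expectancy(standard).items():
        print(f"  {label:<6} {total:>3} {attained:>3} {percent:>3}%")
```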
                          f WITH
ERRORS ON               6 OR MORE
RHYTHM                  WORDS PER
TEST            f         MINUTE       PERCENT
0-11           24           20           83%
12-23          41           23           56%
24-35          18            9           50%

FIG. 1.5C. PERCENT ATTAINING RECEIVING SPEED OF 6 WORDS PER MINUTE OR
BETTER FOR THREE GROUPS ON RHYTHM TEST


EXAMPLE 1.6 


CURVILINEAR RELATIONSHIP BETWEEN TWO VARIABLES 


FIG. 1.6. SCALED SCORES CORRESPONDING TO VERBAL I.Q. OF 100
AND PERFORMANCE I.Q. OF 100 AT DIFFERENT AGE LEVELS
[Line chart: x axis, age, 15 to 80; y axis, means of sums of scaled scores; one curve for verbal tests and one for performance tests.]

Nature of Data. Two instances of curvilinear relationship between two variables
are shown graphically in Fig. 1.6. The upper curve (single line) shows the
relationship between age and the mean sum of “scaled scores” on six verbal tests of the
Wechsler Adult Intelligence Scale (15). The lower curve (double line) shows the
relationship between age and mean sum of “scaled scores” on five Wechsler
performance tests. (On the 11 subtests, obtained scores are converted to “scaled
scores” to make them comparable from test to test. Methods of scaling are
treated in later chapters.) Information for both is abstracted from Wechsler's
manual (15). Values are plotted at the midpoints of the age groupings. Data as
reported by Wechsler are given in the following table.


AGE        VERBAL    PERFORMANCE
16-17       54.6        48.8
18-19       57.3        49.4
20-24       59.5        50.6
25-34       60.8        49.5
35-44       60.2        46.1
45-54       58.0        41.1
55-64       55.8        37.1
65-69       54.0ᵃ       34.5ᵃ
70-74       48.0ᵃ       29.5ᵃ
75-         44.0ᵃ       25.0ᵃ

ᵃ Data taken from tables of norms.


[...] but with performance material, it appears that
[...]. It is to be noted, however, that the successive
age groups involve different individuals. Had the same persons been followed
through their life span, findings might have been greatly altered.


A time dimension is included in many charts. In reports of psychological
research, the horizontal axis is frequently used to show units of time, such
as trials in learning studies, and the vertical axis to show units of
proficiency. (Example 1.7.)


EXAMPLE 1.7 


LEARNING CURVE WITH TIME DIMENSION 


[...] the progress of learning, it is customary to plot a
measure [...] such as seconds required for each trial on the vertical
axis and the time dimension, often in terms of trials, on the horizontal
axis.

FIG. 1.7. AVERAGE TIME SCORES AT SUCCESSIVE STAGES OF LEARNING IN
TERMS OF TRIALS
[Line chart: x axis, trials; y axis, time in seconds.]

TABLE 1.2. DATA FOR FIG. 1.7

TRIAL    TIME IN SECONDS
  1           240
  2           180
  3           153
  4           118
  5            93
  6            72
  7            58
  8            50
  9            43
 10            35
 15            21
 20            15
 25            10
 30             8

[...] methods become fairly complicated, as when an
[...]-dimensional space.
[...] charts, however, lose their point if they are not
of direct assistance in understanding the data they represent. Sometimes
simple charts show that laborious computations are unnecessary, or reveal
unanticipated trends requiring more complete investigation. When research
results are reported to an audience not technically trained, a graph
may make clear an otherwise unintelligible finding.


EXAMPLE 1.8 


RELATIONSHIP AMONG THREE VARIABLES 


Previous flying experience [...]:

A. Pilot's license.
B. Student pilot certificate with solo privileges.
C. Student pilot certificate.
D. Experience as passenger in plane, no formal instruction.
E. No experience in air.

[...] Previous experience can
compensate for low aptitude, and high aptitude can compensate for lack of
experience.

FIG. 1.8. REPRESENTATION OF THREE VARIABLE RELATIONSHIP SHOWING
PERCENTAGES OF AVIATION CADETS ELIMINATED BY CATEGORIES ACCORDING
TO PREVIOUS FLYING EXPERIENCE AND ACCORDING TO PILOT APTITUDE RATING.
N = 7826, OF WHOM 30.2 PERCENT WERE ELIMINATED FOR FLYING DEFICIENCY
[Chart: x axis, previous flying experience, categories A through E.]


There are five main results in descriptive statistics of numerical operations
with continuous data. These are:

1. Measures of central tendency;

2. Measures of variability;

3. Transformations of variables;

4. Measures reflecting the shapes of distributions; and

5. Measures of relationships.

The first four apply to single variables, that is, to the numbers
representing one kind of observation on a group of cases such as scores of
individuals on a reading test. Measures of relationship apply to data
involving two or more variables. In addition to these five classes of
statistical measures, there are various techniques designed to determine the
fundamental structure underlying a large number of variables, so that a few
variables will provide the description originally requiring many measures.


MEASURES OF CENTRAL TENDENCY


A measure of central tendency provides a single number that represents a 
whole series of numbers. An example is the familiar “average,” which in 
statistics is called the arithmetic mean to distinguish it from various other
averages known to mathematics. The arithmetic mean is simply the sum of 
all the values in a series divided by the number of cases. The procedure of 
computing a mean yields a single value, which can often be taken as a 
summary of all the numbers in the series. The mean or some other measure 
of central tendency is employed in most psychological studies using 
statistical methods. 
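
As a minimal illustration of the definition, the sketch below sums a short series and divides by the number of cases. The scores are hypothetical values made up for this example, not data from the text.

```python
def arithmetic_mean(values):
    """Sum of all the values in a series divided by the number of cases."""
    return sum(values) / len(values)


# Hypothetical test scores, invented only to show the arithmetic.
scores = [12, 15, 18, 21, 24]
mean_score = arithmetic_mean(scores)
print(mean_score)  # (12 + 15 + 18 + 21 + 24) / 5
```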

MEASURES OF VARIABILITY 


Measures of variability are designed to describe the spread or scatter of 
numbers. In one set of observations, all the numbers may be fairly close 
together; in another, they may vary considerably one from another. When 
the variability of a series of numbers is expressed in a form technically 
known as the variance (which will be defined later), it may often be 
analyzed into component parts. Thus, in the development of a psychological 
test, an estimate is made of how much of the variance of the total score is 
reliable or consistent, and how much is inconsistent or “error variance.”
When the proportion of error variance is low, the test yields consistent
results, either from one administration to another or from part to part of
the test. In somewhat similar fashion, the total variance of a variable can
often be divided into two parts: the portion that is predictable from one or
more sources, and the portion that remains unpredictable. The predictable
portion, in turn, may be subdivided so as to indicate the relative
importance of the several pr[...]

[...] standing of the same individual on several variables
may be readily compared.


DESCRIPTION OF DISTRIBUTIONS 
Another use of Statistical measures is to describe the shapes of distributions. 
The shape can be readily assessed in a general way from a graph, but
measures exist that help to describe more precisely whether a distribution
is flat or peaked, as well as the degree to which it departs from symmetry. 


MEASUREMENT OF RELATIONSHIP 

A fifth use of descriptive statistics is to measure the degree of relationship 
between variables. In physical sciences such as physics and astronomy, it is 
often possible to state the relationship between two variables as a relatively 
exact equation, with negligible error. In psychology, an equation expressing 
the relationship between two variables can often be established, but 
typically the association is partial. Stating the same fact in different 
language, the description of the relationship between pairs of psychological 
variables is imprecise because various conditions influencing their
relationship remain unknown. Methods to describe partial relationships
between pairs of variables and between combinations of variables are
useful both in formulating general psychological principles and in
forecasting behavior of individuals.


STATISTICS APPLIED TO GENERALIZATION 


The determination of the degree to which the results of descriptive statistics
are likely to apply universally or to a large population is another function
of statistics.

Statistics such as measures of central tendency, variability, and
relationship can be actually computed only for specific [...].
Within the limitations of the methods employed in observing, recording,
and computing, these statistics are exact. However, [...]
we wish to infer as much as possible about the unobserved [...] in the
population which the sample represents. The [...]
values in the population corresponding to the statistics in the sample are
designated as parameters. Although unknowable, parameters can be
estimated. The degree of precision in the estimates can also be [...].
Hence it is often possible to reach fairly dependable conclusions about the
[...].

The degree to which estimates of parameters actually correspond to the
situation in the population depends primarily on how well the sample
represents the population. This, in turn, depends chiefly on two factors: the
methods used in selecting the sample and the sample size. Obviously, the 
more successful the precautions have been in making the sample truly 
representative of the entire population, the closer the statistics will approxi- 
mate parameters. Again, within any valid method of selecting a representa- 
tive sample, the greater the number of cases, the closer the statistics will 
represent the parameter values. A third consideration is the particular 
statistic used. Some statistical measures are unbiased and, without cor- 
rection, can be taken as representative of parameters. Others are biased 
and must be corrected. Often the correction is for sample size, and
becomes of less consequence as sample size increases.
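
A familiar instance of such a correction, offered here as an illustration and not taken from the text, is the variance. Dividing the sum of squared deviations from the sample mean by N tends to underestimate the population value, and multiplying by N / (N - 1) removes the bias. The sketch below uses hypothetical scores and shows how the correction factor approaches 1 as N grows.

```python
def variance_divided_by_n(values):
    """Mean squared deviation from the sample mean (a biased estimate)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def variance_corrected(values):
    """The same quantity multiplied by the correction N / (N - 1)."""
    n = len(values)
    return variance_divided_by_n(values) * n / (n - 1)


# Hypothetical scores, invented for this illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance_divided_by_n(scores))
print(variance_corrected(scores))

# The correction factor becomes negligible as the sample grows.
for n in (8, 80, 800):
    print(n, n / (n - 1))
```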


STATISTICS OF PREDICTION 


Statistical prediction, or forecasting, like statistical inference, is based upon 
the description of a relationship observed within a sample. After the 
inductive process of making a psychological generalization comes the 
deductive process of applying it to an individual who was not measured 
as part of the original group. 

One of the simplest types of forecasting involves two steps. The first step 
is to establish the differential expectancy of an event for several categories of 
individuals. The second is to determine to which of these categories a 
particular individual belongs. In this way the determination of the category
or class to which the individual belongs reveals the likelihood of the event. 
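
The two steps can be sketched with the expectancies of Fig. 1.5C (percent attaining a receiving speed of 6 words per minute or better for three groups on the rhythm test). The function name `forecast` and the layout of the lookup table are choices made for this sketch.

```python
# Step 1: differential expectancy for several categories of individuals.
# Each entry: (lowest errors, highest errors) -> percent attaining 6 words
# per minute or better, as shown in Fig. 1.5C.
EXPECTANCY = [
    ((0, 11), 83),
    ((12, 23), 56),
    ((24, 35), 50),
]


def forecast(errors_on_rhythm_test):
    """Step 2: find the individual's category and return its expectancy."""
    for (low, high), percent in EXPECTANCY:
        if low <= errors_on_rhythm_test <= high:
            return percent
    raise ValueError("score falls outside the categories observed in the sample")


print(forecast(5))   # an individual with 5 errors on the rhythm test
print(forecast(30))  # an individual with 30 errors on the rhythm test
```

A forecast of this kind is only as good as the sample behind it; the percentages here rest on 83 cases, with as few as 18 in one category.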

More elaborate mathematically, but much the same logically, is the 
procedure that involves finding the degree of relationship between two
variables, followed by the establishment of an equation for forecasting
purposes. Suppose, for example, that in a representative sample of high
school students a substantial relationship has been discovered between
scores on a scholastic aptitude test and grades in English composition. If
we have the scholastic aptitude score of a student who has not yet taken 
the course in English composition, we can make a reasonable prediction 
of his grade in the course. Furthermore, we can estimate with considerable 
exactness the amount of error in a set of such predictions for a number of 
individuals, always provided the sample studied is truly representative of 
the population about whom predictions are to be made. 
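
One way such an equation might be set up is sketched below. Two things are assumptions of this sketch and are not stated in the passage: the equation is taken to be a straight line fitted by least squares, and the aptitude scores and grades are hypothetical values invented for the example.

```python
def fit_line(x, y):
    """Least-squares straight line: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample: aptitude scores and English composition grades
# (grade points) for five students already observed.
aptitude = [40, 50, 60, 70, 80]
grades = [1.5, 2.0, 2.5, 3.5, 3.5]

slope, intercept = fit_line(aptitude, grades)

# Prediction for a student not in the sample, with an aptitude score of 65.
predicted_grade = intercept + slope * 65
print(f"predicted grade = {intercept:.2f} + {slope:.3f} * 65 = {predicted_grade:.3f}")
```

With real data the sample would be far larger, and the amount of error in the predictions would be estimated as well.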

The logic used in applied psychology, including making diagnoses in 
clinical work, selecting candidates for admission to schools, and hiring 
employees, is the logic of prediction. Relationships are observed in specific 
samples. Knowledge of these relationships, taken as generalizations, is 
used in forecasting the behavior of individuals. In educational institutions 
and in the armed services, selection procedures are often based directly
on carefully developed prediction equations. In clinical work with ab- 
normal patients and in educational and vocational counseling, procedures 
are sometimes less formal because criteria are less exact, but sound 
psychological practice always demands that relationships within a repre- 
sentative sample be ascertained before predictions are made for individuals. 


SYSTEMS OF DESCRIPTIVE STATISTICS 


There are three main systems of statistical description by numerical
methods: counting, ranking, and averaging. Although the systems are
based upon different principles and employ different arithmetical
operations, all are useful with psychological data. With some information, such
as time and error measures in laboratory experiments and scores on
educational and psychological tests, the statistical procedures of all three 
systems are customarily applied. For other data, counting and ranking are 
appropriate, but averaging is not. For the simplest type of information, the 


basic operation is counting, or enumeration. 


COUNTING 

A fundamental type of description used in psychology is the so-called 
nominal scale,² which is not a scale in the ordinary sense of the word but 
simply two or more classes in the same general domain. Sex (male or 
female), race (Caucasian, Mongoloid, or Negro), and eye color (brown, 
blue, green, gray, or hazel) are examples of nominal scales. In using a 
nominal scale, the classes or categories are defined so that each individual 
is placed in one and only one class in the series of classes. After individuals 
are so classified, counts are made of the number in each of the categories, 
and arithmetical operations based on these counts are performed. Such 
operations are described in Chapters 2 and 3. 
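
As a small illustration of counting on a nominal scale, the sketch below tallies invented eye-color observations. The data and variable names are assumptions made for the example. Each observation carries exactly one category label, which mirrors the requirement that every individual fall in one and only one class.

```python
from collections import Counter

# Invented observations on a nominal scale (eye color).
observations = ["brown", "blue", "brown", "hazel", "green",
                "brown", "blue", "gray", "brown", "blue"]

counts = Counter(observations)     # number of cases in each category
n_total = sum(counts.values())     # total number of cases

for category, f in sorted(counts.items()):
    print(f"{category:6s} {f}")
print("total", n_total)
```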


RANKING 


A second type of quantification commonly used in psychology is the 
ordinal scale, which comprises cases or categories placed in order, as 
from high to low, with regard to a characteristic. Assigning numbers to 
runners in the order in which they finish a race, lining up a squad of 
recruits according to height, and arranging a set of colored papers 
according to gray value are examples of the use of rudimentary ordinal scales. 
If no ties are permitted, each pair of cases is, in effect, judged as to which 
member possesses the characteristic to the greater degree. If ties are 
involved, the judgment of equality of the attribute in two or more cases is 
permitted. This is, of course, the type of judgment required for the simpler 
nominal scale. Formal ordinal scales have been used for the measurement 
of the magnitude of stars and the hardness of metals. 

While order within a series is sometimes used in psychological research, 
the ranking system is also useful with data originally obtained in the form 
of scores. From time to time there has been debate as to whether 
psychological measures yield scores in meaningful units or whether relative order 
within some defined group is the extent to which significant information 
can be extracted. Complete agreement has not been reached on this 
point. It is common practice, however, to treat test scores both as rankable 
information and as having meaningful units. Also, information in the 
form of ranks is often summarized through conventional arithmetical 
operations, including addition. Statistical methods based on ranking are 
discussed in Chapter 4. 

² An informative discussion of types of scales and of statistical measures applicable to 
them is given by Stevens (10). 
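
Converting scores to ranks can be sketched as follows. This is an illustration only: the scores are invented, the function name `rank_high_to_low` is hypothetical, and giving tied cases the mean of the ranks they jointly occupy is one common convention assumed here rather than taken from the text.

```python
def rank_high_to_low(scores):
    """Rank 1 = highest score; tied scores share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next case has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2   # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

scores = [72, 85, 85, 60, 91]   # invented test scores
print(rank_high_to_low(scores))
```

With this convention the ranks always sum to N(N + 1)/2, whether or not ties occur.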


SUMMING AND AVERAGING 


When we can measure in units (as contrasted with merely counting or 
arranging in order), descriptive statistics based on sums and averages are 
clearly appropriate. There are two types of information to which sum- 
mation methods apply directly: interval scales and ratio scales. The 
difference between them is that a ratio scale starts at a true and known 
zero, whereas an interval scale starts at some arbitrary point. Among 
physical measures, temperature is commonly measured on interval scales; 
weight, length, and duration of time on ratio scales. 
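
The practical consequence of an arbitrary zero can be seen in a short sketch. The two temperatures are invented, and the Kelvin scale is brought in here only as a familiar example of a temperature scale with a true zero. Differences are unaffected by the choice of zero point, but ratios are.

```python
def celsius_to_kelvin(c):
    return c + 273.15

a_c, b_c = 10.0, 20.0   # two temperatures on an interval scale (Celsius)
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences survive the change of zero point ...
print(b_c - a_c, round(b_k - a_k, 2))
# ... but ratios do not, so "twice as warm" has no meaning in Celsius.
print(b_c / a_c, round(b_k / a_k, 3))
```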

Scores on psychological and educational tests do not constitute ratio 
scales because an obtained score of zero seldom if ever indicates that the 
individual possesses none of the trait being measured. Such scores are, 
however, often treated as though they were from interval scales. 

Basic descriptive statistics in the summation system are discussed in 
Chapters 5 through 9. 


STATISTICAL NOTATION AND FORMULAS 


Although theoretical statistics is a highly developed branch of mathematics 
rooted in probability theory, elementary statistics as used by the typical 
research or applied psychologist requires chiefly algebra as a mathematical 
background. A few new symbols and operations extend arithmetic and 
algebra into an important tool of psychological research and practice. 

Many of the concepts are stated as formulas, the most common of 
which are ways of indicating in precise notation the operations or steps 
used in treating data so as to arrive at a statistic. 


FIVE VARIETIES OF STATISTICAL SYMBOLS 


Statistical formulas include five distinct varieties of symbols as follows: 

1. Symbols indicating statistical concepts or end products of statistical 
operations, such as N for the total number of cases (a product of counting) 
and M for the mean (a product of averaging). After computation, end 
products of this sort become "statistical constants" because they have 
unique values for a set of data, and often enter as constants into formulas 
for further analyses. 

2. "Operators," or symbols indicating one or more arithmetical 
operations, such as a line (–) for division, or the summation 
sign,³ $\Sigma$, for indicating that a series of numbers is to be added. Most of the 
symbols used in elementary algebra to indicate basic arithmetical 
operations, such as addition, subtraction, multiplication, division, raising to a 
power, and extracting a root, are used in statistics. 

³ Although the Greek capital letter sigma is used as the summation sign, it is 
read "sum of" rather than "sigma." 

3. Symbols, such as X and Y, to indicate variables. These symbols, 
somewhat like their algebraic counterparts, represent the different 
arithmetical values in series of numbers. Thus X may be used to represent the 
reading test scores of 100 eighth grade boys. In the main, the last letters 
of the alphabet are used for variables. In this text, however, Z is never 
used as an original variable, thus reducing possible confusion with z, 
which is used for variables that have been transformed in a certain manner. 

4. Symbols, such as cardinal numbers and letters early in the alphabet, 
indicating constants. As stated above, any of the symbols for end products 
of statistical operations, such as N or M, may also appear in formulas as 
constants. 

5. Symbols, including subscripts and superscripts, to make precise that 
which is denoted by other symbols. For example, the subscript x in the 
expression $M_x$ indicates that reference is made to the mean of variable X. 
Variables may be numbered, in which case $M_1$ may refer to the mean of 
variable 1 and $M_2$ to the mean of variable 2. On the other hand, the values 
of a single variable may be numbered to correspond to cases so that $X_1$ 
represents the first observation or score for the first individual, $X_2$ the 
second observation or score for the second individual, and so on. Symbols 
of functions involving two or more variables are also appropriately 
distinguished with two or more subscripts. 

A bar over the letter indicating a variable is often used to indicate the 
mean. Thus $\bar{X}$ is an alternate way of indicating $M_x$. 
Another use for this type of symbol is to show the limits of summation 
by writing one of the limits below the summation sign and the other above 
it. Thus 

$$\sum_{i=1}^{i=N} X_i$$

or, more briefly, 

$$\sum_{1}^{N} X$$

indicates the operation of summing variable X from the first instance, $X_1$, 
through the Nth case, $X_N$. In cases of double or triple summation, as 
when sums are summed, the order of summing is shown. The summation 
sign closest to the variable indicates the first summation; the next closest, 
the second; and so on. For instance 

$$\sum_{1}^{j} \sum_{1}^{n} X$$

indicates that jn values of X (that is, j sets of n cases each) have been 
summed. First, each set of values is summed from the first case through 
the nth case. Then these j sums are added together from the first sum 
through the jth sum. In mathematical statistics the limits and conditions 
of summation are generally carefully indicated; in psychological statistics 
they are written out when needed for clarity. 
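
The order of operations in a double summation can be made concrete with a short sketch. The scores below are invented (j = 3 sets of n = 4 cases each), and the variable names are chosen for the example.

```python
# Invented scores: j = 3 sets of n = 4 cases each.
sets = [
    [5, 7, 6, 8],
    [4, 9, 5, 6],
    [7, 7, 8, 6],
]

# Inner summation: each set is summed from its first case through its nth.
set_sums = [sum(values) for values in sets]
# Outer summation: the j sums are then added together.
grand_total = sum(set_sums)

print(set_sums, grand_total)
```

The grand total is the same as the sum of all jn values taken in any order; the notation simply records the sequence in which the work is done.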

Another symbol included in this fifth category is the tilde (~), written 
above a symbol for a variable to show that the value is predicted rather than 
obtained. Thus $\tilde{X}$ represents a predicted rather than an obtained score in 
variable X. If a variable has been modified in some way, as by adding or 
subtracting a constant from the original values, it is sometimes identified 
with a prime ('). Thus $X'$ is some sort of modification of X. Primes may 
also be used to distinguish among values of statistical constants under 
specified conditions. 

Roman letters are preferred for statistics and Greek letters for 
corresponding parameters, as M for the mean in the sample and $\mu$ for the 
population mean. Knowledge of the convention is essential in reading advanced 
statistics. In this text, commonly recognized statistical symbols are used 
whether or not they are in accordance with this convention. Accordingly, 
Greek letters are used for certain descriptive statistics. When needed for 
clarity, a circumflex accent (^) indicates a parameter. 

In print it is conventional to substitute italic letters for Roman, thus 
better distinguishing symbols from text. 


UNDERSTANDING A FORMULA 


Not all the implications in a formula can be grasped at a glance, even by 
one with long experience in statistics. Writers of statistical texts and 
articles vary widely in their use of notation. Another difficulty is that a 
formula may indicate, in what is essentially a kind of shorthand, a complex 
series of numerical operations. 

The first step in understanding any formula is to be sure that the meaning 
of each symbol is clear. The second step is to perceive how the symbols are 
put together to convey the meaning the writer intended. The sequence of 
operations usually becomes clear if the formula is carefully studied. 
Operations within parentheses are performed before operations on the 
quantity enclosed by the parentheses, and operations within a summation 
sign are done before operations outside the summation sign. Thus, 

$$\sqrt{\frac{\sum_{i=1}^{i=N} (X_i - M_x)^2}{N - 1}}$$

indicates the following steps: 

1. From each value of X, the mean of all the X's is to be subtracted; 
2. Each of the values of $(X - M_x)$ is to be squared; 
3. The sum of all the values of $(X - M_x)^2$ is to be obtained; 
4. This sum is to be divided by N - 1; 
5. The square root of the quotient is then to be obtained. 
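
The five steps just listed can be followed one at a time in code. The sketch below is an illustration with invented scores; the function name `spread_by_steps` is hypothetical.

```python
from math import sqrt

def spread_by_steps(x):
    """Carry out the five listed steps and return the final square root."""
    n = len(x)
    mean = sum(x) / n
    deviations = [v - mean for v in x]       # step 1: subtract the mean
    squared = [d ** 2 for d in deviations]   # step 2: square each deviation
    total = sum(squared)                     # step 3: sum the squares
    quotient = total / (n - 1)               # step 4: divide by N - 1
    return sqrt(quotient)                    # step 5: take the square root

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # invented scores
print(round(spread_by_steps(scores), 3))
```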




As in conventional algebra, there is some flexibility in the order of 
operations. A constant multiplier (or divisor) within parentheses, or a 
summation sign, may be placed outside, thus indicating a different 
sequence of operations without change in the final result. Accordingly, if a is 
a constant, $\sum aX = a \sum X$. 

A constant within a root sign may be brought outside, but only if its 
power in the expression as a whole is maintained. Thus, in the preceding 
example, 

$$\sqrt{\frac{\sum (X - M_x)^2}{N - 1}} = \frac{1}{\sqrt{N - 1}} \sqrt{\sum (X - M_x)^2}$$

When studied carefully, it will be found that a statistical formula is 
generally an economical way of stating operations and relationships that 
would be cumbersome to describe in words. 
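
Both rearrangements can be checked numerically. The values and the constant below are invented for the illustration; the comparison uses `math.isclose` because floating-point results may differ in the last digit.

```python
from math import sqrt, isclose

x = [3, 5, 8, 10]   # invented values
a = 2.5             # a constant multiplier

# A constant inside the summation may be placed outside it.
assert isclose(sum(a * v for v in x), a * sum(x))

# A constant under a root sign comes outside as its square root.
n = len(x)
mean = sum(x) / n
ss = sum((v - mean) ** 2 for v in x)   # sum of squared deviations
inside = sqrt(ss / (n - 1))
outside = sqrt(ss) / sqrt(n - 1)
assert isclose(inside, outside)
print(inside, outside)
```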
THE FIVE CLASSES OF STATISTICAL FORMULAS 

Of interest in psychology are five classes of statistical formulas: 

1. Formulas defining the concepts of descriptive statistics; 
2. Formulas setting forth economical computing routines; 
3. Formulas for using descriptive statistics for making estimates of 
various kinds; 
4. Formulas of mathematical functions serving as hypotheses; and 
5. Measures of the discrepancy between a hypothesis and corresponding 
empirical findings. 



FORMULAS AS DEFINITIONS 

For each descriptive statistic, the basic formula constitutes an operational 
definition, that is, a schedule of the operations or steps used in treating 
original data so as to arrive at the statistic. The nature of measures of 
central tendency and variability is usually apparent directly from the 
formulas, which need no derivation. Statistics reflecting the relationship 
between pairs of ordered variables may involve fitting a line by the 
mathematical principle of least squares,⁴ while some may involve assumptions 
regarding the underlying nature of one or more variables. In such cases 
the formula still constitutes an operational definition, but the essential 
nature of the measure becomes clear only by considering the 
assumptions involved and by following through the steps of the derivation. 

⁴ A generally accepted mathematical convention, which states that the fit of a line 
representing a joint function of two variables is best when the sum of the squares of the errors 
in fitting is as small as possible. 


COMPUTING FORMULAS 


Formulas that define statistical concepts logically are not necessarily the 
most efficient for actual computation. For example, the basic formula for 
the variance calls for the sum of the squares of numbers that are ordinarily 
decimal fractions. Since computations with such decimal fractions are 
generally awkward, computing formulas have been devised which yield 
more or less identical results but which are based on sums of original scores 
and their squares. 

Computing machines, from the desk calculator to the giant electronic 
computer, can handle numerical information more or less automatically. 
To utilize them to best advantage in statistical work, it is often necessary 
to rewrite formulas so as to take advantage of the economies inherent in 
their mode of operation. A computing formula is simply an algebraic 
variation of a basic formula developed so as to indicate precisely the 
arithmetical steps required by some particular method to arrive at the 
statistic concerned. Often it permits analyses of data in a fraction of the 
time that otherwise would be required. 
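
The relation between a defining formula and a computing formula can be illustrated with the variance. The sketch below is an assumption-laden example, not the text's own formulas: the scores are invented, the function names are hypothetical, and N - 1 is used as the divisor in both versions. The computing version needs only N, the sum of the scores, and the sum of their squares.

```python
def variance_definitional(x):
    """Deviation form: sum of squared deviations from the mean over N - 1."""
    n = len(x)
    mean = sum(x) / n
    return sum((v - mean) ** 2 for v in x) / (n - 1)

def variance_computing(x):
    """Raw-score form: uses only N, the sum of X, and the sum of X squared."""
    n = len(x)
    sum_x = sum(x)
    sum_x2 = sum(v * v for v in x)
    return (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

scores = [12, 15, 11, 18, 14, 16, 13]   # invented whole-number scores
print(variance_definitional(scores), variance_computing(scores))
```

With whole-number scores the computing version avoids the decimal deviations altogether until the final division. On a modern computer with very large scores, the raw-score form can lose precision, so the deviation form is often the safer choice there.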


FORMULAS FOR MAKING ESTIMATES 


Formulas used in making estimates fall into three main categories: 

1. Formulas useful in estimating parameters from statistics computed 
from particular samples; 
2. Formulas for estimating what the statistic would be under changed 
circumstances; and 
3. Equations of best fitting functions (straight lines or curves) for making 
predictions in individual cases. 


MATHEMATICAL FUNCTIONS 


The fourth class of formulas comprises two types of mathematical functions 
serving as hypotheses. These are functions representing the theoretical 
distribution of cases along a continuous scale, or of observations within a 
series of categories, and functions representing the shape of a line of 
relationship between two variables. 


Mathematica] Statistics 


om this hypothesis resulted purely from chance. 


Most frequently, formulas yielding distribution functions are used in 
practical situations in the form of tables. The use of these functions in 
table form provides a convenient way of determining how unusual a 
particular event would be under a stated hypothesis. 

In psychology, the equation for a straight line is frequently used for 
representing the relationship between two variables. However, any 
mathematical function representing the relationship between two sets of 
observations for the same group of cases can be used as a hypothesis. 
When the function is established, it may be used for making predictions 
in individual instances. 


MEASURES OF DISCREPANCY BETWEEN FACTS AND THEORY 


In an experimental investigation, when we compute a measure of the 
relationship between two variables, or of the difference in some statistic 
computed in two groups, or of the difference between an obtained and a 
theoretical distribution, some degree of relationship or of difference is 
almost always found. The question is whether the observed relationship 
or difference exists in the population, or whether it could have arisen in 
the sample merely by chance. If it can be assumed that the sample fairly 
represents the population, it can be determined whether the obtained trend 
is great enough to indicate a real trend in the population. This is the 
function of the fifth class of formulas. 

Often it is formally hypothesized that there is no relationship in the 
population or that the difference is zero. Even though no relationship or 
difference is found in the particular sample or in other samples, this 
hypothesis logically can never be proved. However, if empirical findings 
are compared to the probability function of such findings, that is, with the 
distribution expected purely by chance, we may either accept the hypothesis 
of no relationship as possible, pending further evidence, or regard the 
hypothesis of no relationship as disproved, at a stated level of certainty. 


STATISTICS IN EXPERIMENTAL PSYCHOLOGY 


Of objects known to exist, the human nervous system is undoubtedly the 
most intricate, both in structure and in function. Not only is it elaborately 
responsive to external events of the moment, but its behavior is greatly 
influenced by results of preceding events and their interrelationships. 
It is hardly surprising that a mathematical representation of psychological 
processes such as sensing, perceiving, thinking, and learning must include 
provision for error or uncertainty. 

In the development of all the sciences, naturalistic observation and the 
formulation of broad general principles have preceded the use of 
measurement and the statement of relationships in mathematical terms. However, 
as a science matures, the use of mathematics permits greater precision in 
stating principles and in the prediction of future events. 

The development of psychological statistics is definitely related to the 
attempts of psychologists to define and measure pertinent variables, and 
to describe and understand complex phenomena. In some investigations, 
the effect of extraneous variables can be eliminated by careful selection 
and training of subjects, by the use of control groups, and by the isolation 
of observers and subjects from events that might introduce error into the 
results. In other studies, experimental controls can be replaced to some 
degree by statistical controls; that is, by the statistical removal from the 
variables directly concerned the variance associated with one or more 
disturbing variables. This is accomplished by modifying the variables of 
direct interest so that they become uncorrelated with the extraneous 
variables. 
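
One way to picture statistical control is to remove from a variable the part that a straight line predicts from the extraneous variable, leaving adjusted values that no longer covary with it. The sketch below is a minimal illustration under that assumption; the data, the choice of age as the disturbing variable, and the function names are all invented for the example.

```python
def mean(v):
    return sum(v) / len(v)

def covariance_sum(u, v):
    """Sum of cross-products of deviations from the two means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def residualize(y, z):
    """Remove from y the part linearly predictable from z."""
    slope = covariance_sum(y, z) / covariance_sum(z, z)
    my, mz = mean(y), mean(z)
    return [a - (my + slope * (b - mz)) for a, b in zip(y, z)]

# Invented data: a test score (y) and an extraneous variable, age (z).
age = [8, 9, 10, 11, 12, 13]
score = [20, 24, 23, 29, 30, 35]

adjusted = residualize(score, age)
# The adjusted scores no longer covary with age (up to rounding error).
print(abs(covariance_sum(adjusted, age)) < 1e-9)
```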

The literature of experimental psychology is largely unreadable without 
knowledge of measures of central tendency, variability, and relationship, 
and of methods of using a series of observations within a sample as a basis 
for inferring widely applicable generalizations. Laboratory psychology 
requires considerable statistical sophistication on the part of the 
investigator in order to evaluate pertinent research and to communicate research 
findings to others. 


STATISTICS IN PROFESSIONAL PSYCHOLOGY 

Personnel, counseling, and clinical psychologists depend upon statistical 
studies for the fundamental information on which their practice is based. 
In working with individuals, statistical concepts form the basis for 
interpretation of test results. 


First, 
cerned wi 
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number of pertinent variables is generally large and some usually remain 
unmeasured. As additional relevant information is collected, the known 
probabilities about a future event will often change. For certainty, the 
actual event must be awaited, but the known probabilities are often 
sufficient basis for action. Thus, when a very dull child is enrolled in school, 
it is not necessary to subject him to failure in the regular curriculum before 
assigning him to studies that can reasonably be expected to be within his 
capacities. 

SOURCES OF PSYCHOLOGICAL STATISTICS 


Although statistics is often regarded as a branch of mathematics, it must 
not be supposed that psychologists have been mere borrowers from the 
warehouses of the mathematicians. A goodly proportion of the techniques 
of psychological statistics were invented by psychologists as means of 
solving psychological problems. Some of these techniques have been incor- 
porated into the body of mathematical statistics; in other cases application 
has been restricted to psychology and allied fields. 

The idea of the median is attributed to Gustav Fechner (1801-1887). In 
the history of psychology, Fechner is best known as one of the founders of 
psychophysics, concerned with relationships between external stimulation 
and resultant sensation. 

In connection with his studies of the inheritance of individual differences, 
Sir Francis Galton (1822-1911) developed the concept of correlation 
between two variables. Karl Pearson, his student, expanded on his 
discovery by the derivation of the product-moment formula for correlation, 
and numerous refinements and extensions of correlation theory and 
practice were made by Pearson and his followers. 

Charles Spearman (1863-1945), professor of psychology at University 
College, London, made a number of important contributions to 
psychological statistics. He worked out methods of correlating ranked data. He 
developed the concept of test reliability, and published a formula for 
estimating the reliability of a test when lengthened. He showed how to 
estimate what correlations would be obtained if the variables were freed 
of error variance. Spearman formulated the problem of explaining the 
intercorrelations of a group of observed variables in terms of a smaller number of 
underlying variables. His methods laid the foundations of factor analysis, which 
attempts to reduce the number of psychological variables. Further 
development of factor analysis was also largely the work of psychologists, 
especially L. L. Thurstone (1887-1955). 

In developing his scale for measuring intelligence, Alfred Binet (1857- 
1911) used the basic concepts of item analysis, including item difficulty 
and the correlation of an item with a criterion. Later these concepts became 
formalized in the work of numerous psychologists working with test data. 
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Especially influential were the psychologists trained by James McKeen 
Cattell (1860-1944) and E. L. Thorndike (1874-1949) at Columbia; also 
the group centered at the University of Chicago, who were trained by 
Thurstone and who in 1935 founded Psychometrika,⁵ a journal "devoted 
to the development of psychology as a quantitative rational science." 

The percentile method of scaling variables was developed by Galton. 
Standard scores in various forms seem to be the work of Truman L. Kelley 
(1884-1961), William A. McCall (born 1891), and Clark Hull (1884-1952), 
the last named being better known for his work on hypnosis and on 
learning. 

For a number of years, psychological statistics has been greatly 
influenced by the work of Sir Ronald A. Fisher (1890-1962) whose work was 
first applied in agriculture. His notable contributions have included the 
discovery of important theoretical distributions, knowledge of which 
permits better inference as to generalizations valid in populations 
represented by observed samples. 

While psychologists have drawn freely on the mathematicians for 
distribution functions and on other applied fields for various specialized 
techniques, the field of psychological statistics has a claim to considerable 
autonomy. It has been largely developed by psychologists who, in 
connection with their research and practice, have felt the need for quantifying 
observations, for determining their underlying structure, and for using the 
resulting generalizations both for formulation of principles and for 
prediction of behavior in individual instances. 



SUMMARY 


Both in PSychological resea 
is an important tool. Wit 
description. of relationshi 


usq T Operations are conveniently summarized as 
which indicate how a given statistic is obtained. 


5 The number Of st: 
journals that have 
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EXERCISES 


1. On the initiative scale of a self-description inventory, Ghiselli (4) reports mean 
scores for four occupational levels as follows: 


OCCUPATIONAL LEVEL MEAN 


Top management 33.4 
Middle management 30.7 
Lower management 29.8 
Line workers 28.3 


Show these results as a bar graph. 


2. Lawton and Goldman (7) queried 66 cancer scientists about their opinion of 
cigarette smoking as a cause of lung cancer. Opinions were as follows: 


OPINION                    f 

Is a cause                13 
Probably a cause          42 
Evidence equivocal         8 
Probably not a cause       3 


Present these results as a pie chart. 


3. By the use of pie charts, compare age grouping of the population of the United 
States in 1940 with the age grouping in 1960. Source of data: Health, Education 
and Welfare Indicators (14). 


                 POPULATION (IN MILLIONS) 
AGE GROUP            1940        1960 

65 and over           9.0        16.6 
45-64                26.2        36.1 
20-44                51.6        58.2 
5-19                 34.7        48.8 
Under 5              10.6        20.3 


4. Physician interest scores of 670 university students whose occupational careers 
were known over a period of 20 years have been reported by Strong and 
Tucker (11). The distribution of scores for 108 who became physicians and 
for 562 who did not are: 


PHYSICIAN       STUDENTS        STUDENTS 
INTEREST        BECOMING        NOT BECOMING 
RATING          PHYSICIANS      PHYSICIANS 

A                   70               63 
B+                  14               56 
B                   10               73 
B-                   9               73 
C                    5              297 


Summarize the results in an appropriate chart. 




5. Prepare a chart to show prevalence of impaired hearing according to age and 
sex, using the following data from Weiss (16): 


                      DEAFNESS CASES 
AGE IN YEARS        PER 1000 POPULATION 
                     MALE        FEMALE 

Under 5              0.49          0.43 
5-14                 2.95          2.26 
15-24                3.51          2.93 
25-34                4.76          4.99 
35-44                9.70          9.28 
45-54               14.93         15.45 
55-64               29.27         26.43 
65-74               73.64         54.68 
75 or over         175.08        135.95 


                 NUMBER        NUMBER 
STANINE         GRADUATED     ELIMINATED 

9                  119            15 
8                   41            11 
7                   99            24 
6                  135            68 
5 or below          20            28 


7. In a study of SAS pilots, Trankell (13) reported dismissals as follows for four 
categories of judged suitability for employment: 


                          TOTAL       SUBSEQUENTLY 
CATEGORY                 EMPLOYED      DISMISSED 

Particularly suitable        49             0 
Suitable                    218             8 
Doubtful                     59             4 
Unsuitable                   37            17 


In each category, plot dismissals as a percentage of those employed. 


MEAN NUMBER OF LETTERS REPORTED 


SESSION GROUP 1 GROUP 2 
1 84 5.4 
2 11.6 Ta 
3 10.0 8.2 


Present these results graphically. 


10. 


п. 


12, 


13; 


. HUGHES, J. L., AND MCNAMARA, W. J., 


. Office of the Surgeon, 


. United States Department of Health, E 


. WECHSLER, DAVID, Manual for the Wec 


. WEISS, ALFRED D., 
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DESCRIPTION 
BY COUNTING 


2 


THE NATURE OF CATEGORICAL DATA 


Assignment to categories and enumerating or counting within them is the 
basis of a system of descriptive statistics. 

As noted in Ch: nal scale is sometimes applied to 
the simplest form of isti 
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measures employed are sometimes denoted as the statistics of attributes. 
All these terms refer to the same area of statistical description. 
Ordinarily, information defining the categories cannot be given 
meaningful numerical values. This is in contrast to measuring height in inches or 
centimeters and weight in pounds or kilograms. With nominal data an 
appropriate number of classes or categories is established, individuals are 
assigned to the categories, and counts are made of the individuals so 
assigned. The establishment of a series of mutually exclusive groupings 
carries the implication of description in a definable area rather than of 
measurement along a dimension involving a single characteristic, such as 
time or distance. For example, in an investigation of the inheritance of 
eye color we may find that four categories are sufficient: brown eyes, blue 
eyes, hazel eyes, and gray-green eyes. Although these attributes are 
certainly in a single area, it is likely that more than a single dimension is 
involved. In this case it is impossible to assign a fixed and meaningful order 
to the four classes. Nevertheless, after individuals have been assigned to 
categories, statistics can be computed to summarize group characteristics. 
Such statistics can then be applied to making inferences about the 
population from which the sample has been drawn. 



STATISTICS APPLICABLE TO CATEGORICAL DATA 

Statistics that are applicable to nominal or categorical data include: 

1. N, the total number of cases in a given sample; 
2. f, the frequency or number of cases in a category or subcategory; 
3. p, the proportion of cases within any category (which may also be 
expressed as a percentage); 
4. Mo, the mode, or category with the greatest frequency; 
5. C, the contingency coefficient, used to measure association between 
two categorical variables; 
6. $\chi^2$, used in the chi-square test to determine whether the distribution of 
frequencies within the categories is in accordance with some hypothesis. 


BUILDING A NOMINAL SCALE 

In all sciences a first step is the development of a classification system for 
the objects or processes studied. The biologist, for example, divides the 
plant and animal kingdoms into a hierarchy of categories from the phylum 
down through class, order, family, and genus to the species. At each level 
an indefinite number of coordinate categories are possible, constituting 
one or more nominal scales. However, the totality of categories cannot be 
considered a single nominal scale because a hierarchy implies order. 

In psychology the chief use of sets of categories comes when people can 
be classified by descriptive types. Type-like categories are often useful 
before precise measurement is possible. 
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PRINCIPLES IN ESTABLISHING NOMINAL SCALES 

There are five principles in establishing a nominal scale: 

1. All categories or classes must lie within a single area. Division must be 
on the basis of a single principle, such as pathology, or behavior 
symptoms, or pigmentation. 

2. Categories must be coordinate. Categories from higher and lower 
levels (for example, genus and species) are not to be mixed in the same 
scale. 

3. Categories must be mutually exclusive. 

4. Categories must be clearly defined so that there is a minimum of 
difficulty in making assignments to them. The definition must cover all 
cases properly belonging to the class and must exclude all others. 

5. There should be a sufficient number of classifications so that each 
observed case can be definitely assigned to a category. A miscellaneous 
category is generally undesirable, since its use means that a part of the 
group have not been assigned places on the scale. 

Nominal scales developed according to these principles are important 
research tools. They provide the framework for the collection of data 
which, when treated statistically, may yield valid generalizations and lead 
to useful applications of results. 


BASIC STATISTICAL OPERATIONS WITH NOMINAL SCALES 


After a nominal scale has been constructed, the first operation is to
make decisions as to which individuals are to be classified in each of the
categories. As each case is studied, the category to which it
belongs must be determined.
frequency. With types of measurements in which the observations can be
arranged in order, from the cases showing the highest degree of the
characteristic down to the cases showing the lowest degree, the mode
becomes a “measure of central tendency.” In such cases it sometimes
shows, in a rather crude fashion, the location of the middle of the
distribution.


PERCENTAGES AND PROPORTIONS 

In order to obtain a measure of the relative popularity of the categories
in a nominal scale, the percentage of the total sample that falls in each
group can be found. This is accomplished merely by dividing each f by N
and multiplying by 100 (or, of course, multiplying f by 100 and dividing
the product by N). In notation:

$$\text{percent} = \frac{f}{N} \times 100 = \frac{100f}{N} \qquad (2.2)$$

For any given sample, percentages should, of course, add up to 100
percent (with the likelihood of a small divergence from precisely 100
percent because of rounding). By means of percentages, those distributions
from samples that have different total N may be readily compared. If the
categories are identical, the percentages falling into the different categories
may be directly observed.

Proportions (which are designated as p) have the same intent as
percentages. However, since there is no multiplication by 100, proportions for
a total sample add up to 1.00. In notation:

$$p = \frac{f}{N} \qquad (2.3)$$

A special case of the use of proportions is one in which only two
categories are in the nominal scale, such as male and female or citizen and
alien. In this case the proportion in one of the two categories is denoted as
p and the proportion in the other as q. Then, p + q = 1.00.

Example 2.1 illustrates the use of a nominal scale. It shows the making
of tallies as individuals are assigned to categories, counting the frequencies
to determine the f and N, and converting f to percentages and proportions.


EXAMPLE 2.1

USE OF A NOMINAL SCALE

Source of Data. Each student in elementary psychology was asked to report
the color of his mother's eyes, using the following four categories: brown, blue,
gray-green (various shades of off-blue), or hazel (various shades of off-brown).
Tallying the Results. After a nominal scale for the description of individuals has
been constructed, the next operation is to determine which individuals may be
properly classified in each of the categories. As each case is studied, the class best
describing the individual in the area covered by the scale must be determined.
As the individual is compared with each of the possible classes, the judgment is
“like” or “not like.” Since there is no implication of a hierarchy in a nominal
scale, no judgments of “greater than” or “less than” are involved.

When information is tallied by hand, it is convenient to make tallies in groups
of five, thus facilitating later counting. In making the fifth tally in any group, we
strike through the first four, showing that the group of five is complete. An
alternate method is to make little boxes with the first four tallies, and to cross the
box for the fifth. Thus, one side of the box stands for 1, two sides for 2, three
sides for 3, the completed box for 4, and the crossed box for 5. Both
methods are shown here and in Example 4.1.

Counting. The number of cases in each category is the frequency (denoted as f).

Determination of N. The sum of the frequencies is, naturally enough, N, or 
total number of cases used in the investigation. 


Computation of Percentages and Proportions. To compute percentages for the
several categories, f is merely divided by N, pointing off the results so that the
quotient is, in effect, multiplied by 100. Computation of proportions is exactly
the same, except that there is no multiplication of the quotient by 100. In either
case, any appropriate number of decimal places may be retained.

With a calculating machine, multiplication is generally easier than division.
The reciprocal of N, 1/N, is set into the keyboard. To effect division, this figure
is then multiplied by the several frequencies, f.

Frequencies are both positive and integral. Percentages and proportions are
positive but not necessarily integral.

Listed in the table below are 149 individuals assigned places on a scale of
eye color.
CATEGORY      TALLIES               f    PERCENT   PROPORTION
Brown         [60 tally marks]     60      40         .403
Blue          [42 tally marks]     42      28         .282
Gray-Green    [26 tally marks]     26      17         .174
Hazel         [21 tally marks]     21      14         .141
TOTAL                             149      99        1.000

Note: Modal class = brown; 1/N (reciprocal of N) = 1/149 = .00671
The alternate system of writing tallies for the same data would be as follows:

CATEGORY      TALLIES               f
Brown         [box tallies]        60
Blue          [box tallies]        42
Gray-Green    [box tallies]        26
Hazel         [box tallies]        21


In summing percentages and proportions, there is often “rounding” error in
the final digit resulting from the cumulative effect of adding a series of numbers
of which all have been rounded off to the nearest digit. In the example, there is
an error of 1 percent in the sum of the percentages, but the proportions happen
to sum to 1.000 precisely.
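
The arithmetic of Example 2.1 can be retraced in a few lines of Python. This is only a sketch: the frequencies are those given in the example, while the variable names (`frequencies`, `rounded_percentages`, and so on) are illustrative choices and not part of the text.

```python
# Frequencies from Example 2.1 (eye color of 149 mothers).
frequencies = {"Brown": 60, "Blue": 42, "Gray-Green": 26, "Hazel": 21}

# N is the sum of the frequencies.
n_total = sum(frequencies.values())

# Proportion p = f / N; percentage = 100 * f / N.
proportions = {cat: f / n_total for cat, f in frequencies.items()}
percentages = {cat: 100 * p for cat, p in proportions.items()}

# Round as in the example: whole percents, three-place proportions.
rounded_percentages = {cat: round(v) for cat, v in percentages.items()}
rounded_proportions = {cat: round(p, 3) for cat, p in proportions.items()}

# The modal class is the category with the largest frequency.
modal_class = max(frequencies, key=frequencies.get)

for cat, f in frequencies.items():
    print(f"{cat:<12}{f:>5}{rounded_percentages[cat]:>6}{rounded_proportions[cat]:>8.3f}")
print("Sum of rounded percentages:", sum(rounded_percentages.values()))
print("Sum of rounded proportions:", round(sum(rounded_proportions.values()), 3))
print("Modal class:", modal_class)
```

The rounded percentages sum to 99 rather than 100, which is the rounding error mentioned above, while the rounded proportions sum to 1.000.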

A Note on the Degrees of Freedom. If N is unknown or has no limit (as is often
the case when information is being tabulated), the concept of the “degrees of
freedom” has no particular pertinency. However, in advanced work with data
arranged in categories, as will be explained later, knowledge of the degrees of
freedom (or df) is often essential.

If n is the number of categories in a nominal scale, then, when N is fixed,
(n − 1) is the number of degrees of freedom.

In the present example, when N is fixed, df = 3. The fact that information on
three groups plus information on the total sample leads directly to complete
information on all four groups is a simple instance of the concept of df.
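
As a small illustration of this note, the sketch below recovers the fourth frequency of Example 2.1 from the other three and the fixed N. The names `known_frequencies` and `remaining_frequency` are hypothetical and chosen only for this illustration.

```python
# With N fixed, only n - 1 of the n category frequencies are free to vary.
n_total = 149                      # fixed N from Example 2.1
known_frequencies = {"Brown": 60, "Blue": 42, "Gray-Green": 26}
n_categories = 4

degrees_of_freedom = n_categories - 1

# The last category (hazel) is determined by the other three and N.
remaining_frequency = n_total - sum(known_frequencies.values())

print("df =", degrees_of_freedom)
print("Hazel frequency implied by the others:", remaining_frequency)
```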


MEASURING RELATIONSHIP WITH THE CONTINGENCY COEFFICIENT 


The degree to which two nominal scales vary together can be assessed
with the contingency coefficient, denoted as C. It increases as individuals in
certain categories on one nominal scale are more likely to appear in certain
categories on a second nominal scale. If the distribution on one scale has
no relationship to the distribution on the second scale, C = .00. If the
contingency coefficient is not zero, it is positive. Negative values are
impossible.

Actually, the use of C is in no way restricted to unordered data. It is
applicable to finding the relationship between any two sets of observations
that have been grouped into categories. However, when both sets of
categories have order, or are measured in units, types of correlation to be
described in later chapters are generally more useful than contingency.
Accordingly, the C coefficient is used chiefly when one of the variables or
both are in unordered or nominal categories.
Data of Example 2.2, reflecting the status of patients one year after
discharge from a mental hospital, illustrate C. The three categories of a
simple nominal scale for psychiatric diagnosis are: “schizophrenia,”
the “affective psychoses” (manic depressive insanity and involutional
melancholia), and “psychoneuroses.” The two categories for status are
“unimproved” and “improved.” These latter two categories may reasonably
be considered as constituting a simple scale involving ordered data,
since improvement is better than lack of improvement. However, whether
these categories are regarded as ordinal or merely as descriptive without
evaluation, the procedure is identical.


EXAMPLE 2.2

COMPUTATION OF C, THE CONTINGENCY COEFFICIENT


Source of Data. Status of 417 mental patients one year after discharge from a 
mental hospital, as reported by Pascal et al. (5). 

Analytic Formulation. C is computed from a two-way frequency diagram in
which each case is classified in one of the categories of a first scale and also in
one of the categories of a second scale.

Categories need not be ordered. There may be any number of rows and columns
corresponding to the categories in the two variables. In the particular 3 × 2
example (three rows and two columns of primary information) entries can be
designated as follows:


                        UNIMPROVED   IMPROVED   TOTAL
Schizophrenia              f_o          f_o      f_r
Affective psychoses        f_o          f_o      f_r
Psychoneuroses             f_o          f_o      f_r
TOTAL                      f_c          f_c      N

If all categories are numbered so that every frequency can be explicitly
identified, with double subscripts denoting joint frequencies involving one category on
one scale and another category on a second scale, then the preceding entries are:

                               (1)          (2)
                           UNIMPROVED    IMPROVED   TOTAL
(1) Schizophrenia             f_11         f_12      f_r1
(2) Affective psychoses       f_21         f_22      f_r2
(3) Psychoneuroses            f_31         f_32      f_r3
TOTAL                         f_c1         f_c2      N


It is apparent that there are three types of frequencies in the diagram. The f_r
are the marginal frequencies in the rows. They are found by summing the cell
frequencies row by row, and show the total distribution for one scale.

The f_c are the marginal frequencies in the columns. They represent sums of
cell frequencies within columns and show the total distribution for the second
scale.

The f_o are the observed cell frequencies. They constitute the primary
information from which the f_r, the f_c, and N are found. It will be noted that
Σf_r = Σf_c = Σf_o = N.

Formula 2.4 for C is

$$C = \sqrt{1 - \frac{1}{\sum \left( f_o^2 / f_r f_c \right)}} \qquad (2.4)$$

Each f_o is to be squared and divided by the product of the two corresponding
marginal entries, f_r and f_c. The sum of all values of f_o²/f_r f_c is found and divided
into unity. The quotient is subtracted from 1. The square root of the result is C.

Computation. A convenient first step is to form a table of f_r f_c products, one
for each cell. Each f_o is then squared and divided by the corresponding f_r f_c
product. The quotients are then summed.
If a desk calculating machine with a “nonentry” feature is used, numbers can
be squared in such machine space that the result is ready to use as a dividend
without the entry of the multiplier into the quotient dials. By planning the work
so that the machine decimal places remain constant, it is possible to accumulate
Σ(f_o²/f_r f_c) without writing down individual quotients. On such machines, we need
to record for each cell only f_o and f_r f_c. On other machines, we need also to record
each quotient, f_o²/f_r f_c.
Computational Checks. All diagrams should be checked prior to other
computations by summing the three sets of frequencies and noting that all three sums
equal N. That is,

$$\sum f_o = \sum f_r = \sum f_c = N$$

A second check, which should be executed before forming the quotients, is

$$\sum f_r f_c = N^2$$

It will be remembered that C cannot exceed $\sqrt{(n' - 1)/n'}$, in which n′ is the smaller
number of the two sets of categories. It attains the maximum value only when
all cases in categories of one scale are also found in corresponding categories
of the other scale.


Tabulation of Frequencies

                        UNIMPROVED   IMPROVED   TOTAL
Schizophrenia              173          91       264
Affective psychoses         17          41        58
Psychoneuroses              24          71        95
TOTAL                      214         203       417
Tabulation of Percentages Based on N

                        UNIMPROVED   IMPROVED   TOTAL
Schizophrenia               41          22        63
Affective psychoses          4          10        14
Psychoneuroses               6          17        23
TOTAL                       51          49       100


Tabulation of Percentages Within Diagnostic Categories 


UNIMPROVED IMPROVED TOTAL 
Schizophrenia 65.5 34.5 100 
Affective psychoses 29.3 70.7 100 
Psychoneuroses 25.3 74.7 100 

Numerical Steps in Finding C

                                        UNIMPROVED   IMPROVED   TOTAL
Schizophrenia         f_o                   173          91      264
                      f_o²               29,929       8,281
                      f_r f_c            56,496      53,592
                      f_o²/f_r f_c       .52975      .15452
Affective psychoses   f_o                    17          41       58
                      f_o²                  289       1,681
                      f_r f_c            12,412      11,774
                      f_o²/f_r f_c       .02328      .14277
Psychoneuroses        f_o                    24          71       95
                      f_o²                  576       5,041
                      f_r f_c            20,330      19,285
                      f_o²/f_r f_c       .02833      .26139
TOTAL                                       214         203      417
Using the data in the accompanying tables,

Σf_o = 173 + 91 + 17 + 41 + 24 + 71 = 417
Σf_r = 264 + 58 + 95 = 417
Σf_c = 214 + 203 = 417
N = 417
Σf_r f_c = 56,496 + 53,592 + 12,412 + 11,774 + 20,330 + 19,285 = 173,889
N² = 173,889
Σ(f_o²/f_r f_c) = .5298 + .1545 + .0233 + .1428 + .0283 + .2614 = 1.1401

Computation of C:

$$C = \sqrt{1 - \frac{1}{\sum \left( f_o^2 / f_r f_c \right)}} = \sqrt{1 - \frac{1}{1.1401}} = \sqrt{1 - .8771} = \sqrt{.1229} = .35$$


Interpretation. In a 3 × 2 table, C has a maximum value of .707. The value of
.35 appears to show definite relationship between the two scales.
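
The steps of Example 2.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `contingency_coefficient` and its list-of-rows input are illustrative choices, and every marginal total is assumed to be greater than zero. The two computational checks described above are included as assertions.

```python
from math import sqrt

def contingency_coefficient(table):
    """Return C = sqrt(1 - 1/S), where S is the sum of f_o^2 / (f_r * f_c)."""
    row_totals = [sum(row) for row in table]           # the f_r
    col_totals = [sum(col) for col in zip(*table)]     # the f_c
    n = sum(row_totals)                                # N

    # First check: both sets of marginal frequencies sum to N.
    assert sum(col_totals) == n
    # Second check: the f_r * f_c products sum to N squared.
    assert sum(fr * fc for fr in row_totals for fc in col_totals) == n * n

    s = sum(
        f_o ** 2 / (row_totals[i] * col_totals[j])
        for i, row in enumerate(table)
        for j, f_o in enumerate(row)
    )
    # Guard against a tiny negative value caused by floating-point rounding.
    return sqrt(max(0.0, 1.0 - 1.0 / s))

# Rows: schizophrenia, affective psychoses, psychoneuroses.
# Columns: unimproved, improved.
observed = [[173, 91], [17, 41], [24, 71]]

c = contingency_coefficient(observed)
print(f"C = {c:.2f}")
```

Carrying full precision instead of four-place quotients changes the intermediate sum slightly, but the result still rounds to .35.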
If there were no relationship between type of psychiatric disorder and
later status, it would be expected that patient status within each of the 
psychiatric categories would be distributed in the same proportion as in 
the total sample. 

It will be noted in Example 2.2 that of the total group of 417 patients, 
51 percent were reported unimproved and 49 percent improved. Accor- 
dingly, it would be expected, under the condition of no relationship 
between diagnosis and later status, that 51 percent of the schizophrenics, 
51 percent of those with affective psychoses, and 51 percent of the psycho- 
neurotics would be unimproved, while 49 percent in each diagnostic group 
would be improved. 

That this is not true is apparent from an inspection of the percents
based upon frequencies within diagnostic categories. Over 65 percent of the
schizophrenics (as contrasted with an expectancy of 51 percent) are
unimproved. On the other hand, less than 30 percent of those with affective
psychoses and psychoneuroses are unimproved.


TWO HYPOTHETICAL EXAMPLES

Two hypothetical cases of the relationship between nominal scale I
(categories A, B, C, and D), and nominal scale II (categories E, F, G, and
H) are given as Examples 2.3 and 2.4. In Example 2.3, the frequencies of
scale I are distributed within the categories of scale II in exact accordance
with the total distribution of scale I, and vice versa. Thus, of the 35 cases
in category A, 7 are in category E; 7 in F; 14 in G; and 7 in H. It will be
readily seen that this type of relationship holds throughout the table.
Accordingly, if this relation is typical of the population and if information
as to where an individual falls on scale II is unknown, knowledge of the
category in which he falls in scale I will give no inkling of his status on
scale II. Scale I and scale II are independent and have no association.

EXAMPLE 2.3

The hypothetical two-way distribution of two nominal scales with no
association is given in the accompanying table.

                       SCALE II
SCALE I        E     F     G     H    TOTAL (f_r)
A              7     7    14     7       35
B              3     3     6     3       15
C              5     5    10     5       25
D              5     5    10     5       25
TOTAL (f_c)   20    20    40    20    N = 100

Computing C: Σ(f_o²/f_r f_c) = 1.00, so that C = √(1 − 1/1.00) = .00.

In Example 2.3, where the cell frequencies for any category in either
scale are exactly proportional to the marginal frequencies for the other
scale, each of the cell frequencies can be reproduced by multiplying together
the f_r, or marginal frequency in the same row, and the f_c, or marginal
frequency in the same column, and dividing by N. This value, f_r f_c/N, is
really f_e, the cell frequency to be expected by chance when there is no
relationship between the variables. In Example 2.3, each f_o, or observed
frequency, is equal to the corresponding f_e.
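
The expected frequencies for Example 2.3 can be reproduced with the short Python sketch below. The variable names are illustrative assumptions; the cell values are those of the example.

```python
# Observed frequencies of Example 2.3 (rows A-D of scale I, columns E-H of scale II).
observed = [
    [7, 7, 14, 7],
    [3, 3, 6, 3],
    [5, 5, 10, 5],
    [5, 5, 10, 5],
]

row_totals = [sum(row) for row in observed]          # f_r: 35, 15, 25, 25
col_totals = [sum(col) for col in zip(*observed)]    # f_c: 20, 20, 40, 20
n_total = sum(row_totals)                            # N = 100

# Expected frequency under no relationship: f_e = f_r * f_c / N.
expected = [[fr * fc / n_total for fc in col_totals] for fr in row_totals]

# In this example every observed frequency equals its expected frequency.
all_equal = all(
    f_o == f_e
    for obs_row, exp_row in zip(observed, expected)
    for f_o, f_e in zip(obs_row, exp_row)
)
print("Every f_o equals its f_e:", all_equal)
```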

In Example 2.4 the situation is different. Here the distribution according
to scale I is the same as in Example 2.3, but the relationship between the
two scales is such that if we know the category on scale I, the category on
scale II becomes known. If the sample used in constructing the table in
Example 2.4 is truly representative of the population from which it was
drawn, then we know that an individual in category A on scale I is
necessarily also in category F on scale II, that an individual in category B is
necessarily in category H, and so on. The relationship between the two
scales is perfect, and the scales may be considered identical, at least in
this sample.


EXAMPLE 2.4

The hypothetical two-way distribution of two nominal scales with perfect
association is given in the accompanying table.

                       SCALE II
SCALE I        E     F     G     H    TOTAL (f_r)
A              0    35     0     0       35
B              0     0     0    15       15
C              0     0    25     0       25
D             25     0     0     0       25
TOTAL (f_c)   25    35    25    15    N = 100

Computing C: Σ(f_o²/f_r f_c) = 4.00, so that C = √(1 − 1/4.00) = √.75 = .87.
THE NATURE OF C, THE CONTINGENCY COEFFICIENT

When, in each cell, f_o equals the corresponding f_e, it can be shown that
Σ(f_o²/f_r f_c)¹ = 1.00. This follows from certain algebraic relationships. When
f_r f_c/N = f_e = f_o, then by multiplication by N, f_r f_c = N f_o. Accordingly,
making simple substitutions and in the second term dividing both
numerator and denominator by f_o, we have

$$\sum \frac{f_o^2}{f_r f_c} = \sum \frac{f_o^2}{N f_o} = \sum \frac{f_o}{N} = \frac{1}{N} \sum f_o = \frac{1}{N} N = 1$$

The formula for the contingency coefficient is

$$C = \sqrt{1 - \frac{1}{\sum \left( f_o^2 / f_r f_c \right)}} \qquad (2.4)$$

Inspection of this formula shows that when each f_o is equal to its
corresponding f_e, the coefficient will be zero. This follows from the fact
that, in that instance, 1/Σ(f_o²/f_r f_c) = 1/1 = 1.

The opposite situation exists when the frequencies in each category on
one scale are concentrated in single categories on the second scale, as in
Example 2.4. Here, f_o = f_r = f_c and each f_o²/f_r f_c = 1.00.

When the relationship between two scales is perfect, so that classification
within one set of categories is predictable from knowledge of classification
in the other set, the number of categories, designated as n′, must be the
same for both. Since there will be a value of f_o²/f_r f_c for each pair of
categories in the two scales, and since this value will be 1.00, Σ(f_o²/f_r f_c)
will equal n′.
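
Both limiting cases can be checked numerically on the tables of Examples 2.3 and 2.4. The sketch below uses exact fractions so that the sums come out as whole numbers; the helper name `sum_of_ratios` is an assumption made for illustration.

```python
from fractions import Fraction
from math import sqrt

def sum_of_ratios(table):
    """Exact value of the sum of f_o^2 / (f_r * f_c) over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return sum(
        Fraction(f_o * f_o, row_totals[i] * col_totals[j])
        for i, row in enumerate(table)
        for j, f_o in enumerate(row)
    )

# Example 2.3: no association between the scales.
no_association = [[7, 7, 14, 7], [3, 3, 6, 3], [5, 5, 10, 5], [5, 5, 10, 5]]
# Example 2.4: perfect association between the scales.
perfect_association = [[0, 35, 0, 0], [0, 0, 0, 15], [0, 0, 25, 0], [25, 0, 0, 0]]

for label, table in [("Example 2.3", no_association), ("Example 2.4", perfect_association)]:
    s = sum_of_ratios(table)
    c = sqrt(1 - 1 / s)          # Formula 2.4
    print(f"{label}: sum = {s}, C = {c:.3f}")
```

The sum is 1 for Example 2.3, giving C = 0, and equals the number of categories, 4, for Example 2.4.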
This relationship is the basis for the formula for the maximum possible
value of C for any n′:²

$$C_{\max} = \sqrt{1 - \frac{1}{n'}} = \sqrt{\frac{n' - 1}{n'}} \qquad (2.5)$$

By Formula 2.5 certain maximum values of C are as given in the
following table.

n′     C_max
 2     .707
 3     .816
 4     .866
 5     .894
 6     .913
 7     .926
 8     .935
 9     .943
10     .949

In interpreting any obtained value of C, it is useful to compare it with
the maximum C possible. Thus, an obtained C of .60 when n′ = 2 would
indicate a higher degree of relationship than when n′ = 9. However,
when numbers of categories are different, there is no commonly
recognized method of comparing two obtained C's.

¹ It should be noted that $\sum (f_o^2 / f_r f_c)$ and $\sum \frac{f_o^2}{f_r f_c}$ are alternate ways of indicating the
identical quantity.
² If the n′ for the two variables differ, n′ in Formula 2.5 refers to the smaller number of
categories.
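
The maximum values of C for 2 through 10 categories follow directly from Formula 2.5 and can be regenerated with a few lines of Python. The function name `c_max` is an illustrative assumption.

```python
from math import sqrt

def c_max(n_categories):
    """Maximum possible C for n' categories (Formula 2.5)."""
    return sqrt((n_categories - 1) / n_categories)

# Tabulate the maximum C for n' = 2 through 10, rounded to three places.
c_max_table = {n: round(c_max(n), 3) for n in range(2, 11)}

for n, value in c_max_table.items():
    print(f"{n:>2}  {value:.3f}")
```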


PREDICTION WITH CATEGORICAL DATA: THE EXPECTANCY CHART 


Whenever information obtained by studying a sample is used as a basis
for predicting the behavior of individuals, it is assumed, tacitly or
explicitly, that the sample is truly representative of individuals later to be
encountered.

In simplest form, prediction from categorical data involves:
1. Knowledge of the relationship, in a representative sample, between
two variables;
2. Classification of an individual by determining that he belongs in a certain
category of one of these variables; and
3. From these two ascertained facts, inferring the probability of the
individual falling in one of the categories of the second variable.


The process is illustrated in Fig. 2.1. In the sample of 417 cases studied,
48.7 percent showed improvement one year after discharge. However,
improvement was shown by only 34.5 percent of the schizophrenics,
compared with 70.7 percent of those with affective psychoses, and 74.7
percent of the psychoneurotics. While it is exceedingly unlikely that any
subsequent sample would show these identical percentages, it can be
estimated that in future samples approximately 35 percent of the
schizophrenics will show improvement, as contrasted with approximately
70 percent of those with affective psychoses and about 75 percent of the
psychoneurotics. This is what is known as group prediction. From the
proportion observed in a sample, the proportion in subsequent samples is
predicted. Such prediction has been quite successful in forecasting highway
accident rates, mortality rates by sex and age, and the like.

[Fig. 2.1. Expectancy chart]

PSYCHIATRIC DIAGNOSIS      N     PERCENT SHOWING IMPROVEMENT
Schizophrenia             264              34.5
Affective Psychoses        58              70.7
Psychoneuroses             95              74.7
TOTAL                     417              48.7
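
The percentages behind the expectancy chart can be recomputed from the frequencies of Example 2.2. The sketch below is illustrative only; the dictionary names are assumptions, and the counts are taken from the tabulation of frequencies in that example.

```python
# Frequencies from Example 2.2: number improved and total, by diagnosis.
improved = {"Schizophrenia": 91, "Affective psychoses": 41, "Psychoneuroses": 71}
totals = {"Schizophrenia": 264, "Affective psychoses": 58, "Psychoneuroses": 95}

# Percent showing improvement within each diagnostic category.
percent_improved = {
    diagnosis: round(100 * improved[diagnosis] / totals[diagnosis], 1)
    for diagnosis in totals
}

# Percent showing improvement in the total sample.
overall_percent = round(100 * sum(improved.values()) / sum(totals.values()), 1)

for diagnosis, pct in percent_improved.items():
    print(f"{diagnosis:<22}{totals[diagnosis]:>5}{pct:>8.1f}")
print(f"{'TOTAL':<22}{sum(totals.values()):>5}{overall_percent:>8.1f}")
```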

Prediction in individual instances follows identical logic. Using the
relationship shown in Fig. 2.1, it can be said that if an individual is a
schizophrenic about to be discharged from a hospital, the probability of
his showing improvement a year from now is about .35. On the other hand,
if he is a psychoneurotic under the same circumstances, the probability
of his showing improvement is about .75. If only the categories of
“improvement” or “no improvement” are used in the evaluation of
outcome, any individual prediction will be either right or wrong. However,
by predicting no improvement for the schizophrenics and improvement for
the psychoneurotics, predictions will generally be right for a long series of
cases.
Figure 2.1 shows forecasting a single criterion (improvement or no
improvement a year after discharge) from a single predictor variable
(diagnostic category). To improve prediction, it would be necessary to
take more information into account. It is possible that data on age, sex,
previous mental illness, physical condition, occupational level, and
attitude of family toward the patient would be helpful in increasing the
predictability of the criterion. Especially important, since it would
contribute to making decisions with regard to the patient, would be
information on the differential effects of various types of post-hospital care.

When predictors are measured in units that can be added; when the
relationships between sets of measurements are best represented by
straight lines; and, as always, if the trends in the sample adequately
represent trends in the unknown population, an excellent statistical
solution exists for the prediction of a criterion from a number of predictors.
This is the technique of multiple correlation, treated in Chapter 7.
Sometimes categorical information is transformed into simple scales, with
pairs of classes being treated as the presence or absence of a trait. Such
information then becomes amenable to treatment by multiple correlation.

Sometimes information on two or more nominal scales is used directly
in the prediction of a criterion. A practical difficulty in predicting from a
number of nominal scales simultaneously is that the number of
subcategories increases rapidly as each new scale is added. In predicting a
two-category criterion from four nominal scales, two with five categories
each and two with four categories, the total number of subcategories of
predictive information would be 5 × 5 × 4 × 4 = 400. For an average of
25 frequencies in each of the cells, prior to division according to the
criterion variable, observations on 10,000 cases would be needed. Unless
it is possible to develop generalizations that will group categories from
two or more scales, such prediction is cumbersome.
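
The growth in the number of subcategories is simple multiplication, as the following sketch shows. The variable names are illustrative, and `math.prod` assumes Python 3.8 or later.

```python
from math import prod

# Four nominal predictor scales: two with five categories, two with four.
categories_per_scale = [5, 5, 4, 4]

# Each combination of categories is one cell of predictive information.
n_cells = prod(categories_per_scale)

# Cases needed for an average of 25 observations per cell.
average_per_cell = 25
cases_needed = average_per_cell * n_cells

print("Subcategories:", n_cells)
print("Cases needed:", cases_needed)
```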


PREDICTION FROM TWO NOMINAL SCALES JOINTLY 


If two scales are predictive of a criterion and yet are independent or partly
independent of each other, prediction from the two operating jointly is
more effective than from either alone. Such is the case with the data
presented in Example 2.5.


EXAMPLE 2.5

JOINT PREDICTION FROM TWO NOMINAL SCALES
(Eye Color of Father and Eye Color of Mother)

TABLE 2.1. PREDICTION OF CHILD'S EYE COLOR FROM THAT OF FATHER

                       CHILD'S EYES
FATHER'S              GRAY-                                 PROPORTION OF
EYES          BLUE    GREEN   HAZEL   BROWN   PREDICTION    PREDICTIONS CORRECT
Brown          16      11       3      55     Brown         55/85 (p = .647)
Hazel           2       0       6       2     Hazel          6/10 (p = .600)
Gray-Green     12      15       1       9     Gray-Green    15/37 (p = .405)
Blue           41      11       4       7     Blue          41/63 (p = .651)

Remarks. Of 195 college students, 117 reported that they have the same eye
color as their fathers. If this sample is representative of a population in which
we wish to predict eye color, we can say that the probability of a child having
the same eye color as his father is .60.

TABLE 2.2. PREDICTION OF CHILD'S EYE COLOR FROM THAT OF MOTHER

                       CHILD'S EYES
MOTHER'S              GRAY-                                 PROPORTION OF
EYES          BLUE    GREEN   HAZEL   BROWN   PREDICTION    PREDICTIONS CORRECT
Brown          13       9       3      47     Brown         47/72 (p = .653)
Hazel           6       2       7       8     Hazel          7/23 (p = .304)
Gray-Green     11      15       3      11     Gray-Green    15/40 (p = .375)
Blue           41      11       1       7     Blue          41/60 (p = .683)
TABLE 2.3. RELATIONSHIP BETWEEN EYE COLOR OF MOTHERS AND FATHERS
FATHER’S EYE COLOR 


MOTHER'S GRAY- 
EYE COLOR BLUE GREEN HAZEL BROWN TOTAL 
Brown 19 13 3 37 72 
Hazel 5 4 4 10 23 
Gray-Green 12 9 2 17 40 
Blue 27 11 1 21 60 
TOTAL 63 37 10 85 195 


Remarks. In contrast with the data in Tables 2.1 and 2.2, no relationship
is apparent here. The value of C shows some association between eye color of
the two parents. A hypothesis that such an association exists is reasonable, as
the relationship could result from a tendency of individuals with similar racial
backgrounds to marry. Whether the association in this sample represents a trend
in the population represented by the sample, or whether it can be ascribed to
sampling error, is treated in Chapter 3.


TABLE 2.4. EYE COLOR OF CHILDREN IN RELATION TO EYE COLOR OF BOTH PARENTS
(Within each cell, frequencies of eye color of children are given in the order: brown,
hazel, gray-green, blue, and total.)

                                   FATHER'S EYE COLOR
MOTHER'S      CHILD'S EYE             GRAY-                      CORRECT
EYE COLOR     COLOR          BLUE     GREEN    HAZEL    BROWN    PREDICTIONS
Brown         Brown            6        7ᵃ       2ᵃ      32ᵃ
              Hazel            1        1        1        0      50/72
              Gray-Green       3        3        0        3      (p = .694)
              Blue             9ᵃ       2        0        2
              TOTAL           19       13        3       37
Hazel         Brown            0        0        0        8ᵃ
              Hazel            2        0        4ᵃ       1      17/23
              Gray-Green       2        0        0        0      (p = .739)
              Blue             1ᵃ       4ᵃ       0        1
              TOTAL            5        4        4       10
Gray-Green    Brown            1        1        0        9ᵃ
              Hazel            0        0        1        2      24/40
              Gray-Green       4        7ᵃ       0        4      (p = .600)
              Blue             7ᵃ       1        1ᵃ       2
              TOTAL           12        9        2       17
Blue          Brown            0        1        0        6
              Hazel            1        0        0        0      41/60
              Gray-Green       2        5        0        4      (p = .683)
              Blue            24ᵃ       5ᵃ       1ᵃ      11ᵃ
              TOTAL           27       11        1       21
CORRECT                     41/63    23/37     8/10    60/85     132/195
PREDICTIONS              (p = .651) (p = .622) (p = .800) (p = .706) (p = .677)

ᵃ Predictions according to rules stated in the text.
In Table 2.1 are exhibited the cross tabulations of eye color of father
and child. Inspection reveals a marked degree of relationship. Compu- 
tations for the contingency coefficient are not shown, but C = .60 compared 
with a maximum possible value of .87. The best generalization over all 
categories is that the child tends to have the same eye color as his father. 
In this table, 117 cases fit the rule; 78 do not. 

In Table 2.2 a similar cross-tabulation is presented of the relationship 
between the eye color of mother and child. Again the best generalization 
to be abstracted from the data is that the eye color of the child is likely to 
be the same as the eye color of the parent (in this tabulation, that of the 
mother). It is to be noted, however, that the hazel-eyed mothers have 
eight children with brown eyes compared with only seven children with 
hazel eyes. Accordingly, a brief could be made for predicting brown-eyed 
children in such cases. However, the difference is only a single case, and 
could represent inadequate sampling. The discrepancy is not large enough 
to require revision of the generalization. Of the 195 cases, 110 agree with
the stated principle, contrasted with 85 exceptions. The contingency
coefficient for these data is .55, again indicating substantial relationship.

Table 2.3 shows a different situation. If there is a tendency for the eye
color of mothers to be associated with the eye color of fathers, it is slight.³
This is confirmed by a contingency coefficient of .27.

In the study as a whole, the two independent variables, eye color of
father and eye color of mother, are both predictive and more or less
independent, so that we have reason to believe that the combination will
be more predictive than either singly.

In Table 2.4 is presented the trivariate distribution of eye color of 
mothers, fathers, and children. The presentation should be considered a 
three-dimensional diagram and, as such, could be shown as a set of four 
two-dimensional diagrams, one for each of the four categories of one of 
the scales. 


Inspection of the table gives a basis for joint prediction. Within each of the
cells formed by parent eye color, the most frequent color of children's
eyes helps in the formulation of an appropriate generalization.

Four generalizations based on the sample seem to be pertinent for making
predictions:

1. If both parents have the same eye color, predict that color (67 of 77
instances correct, or 87 percent).

2. If one parent is blue-eyed, predict blue eyes (34 of 66 instances correct,
or 51 percent). Note: Blue-eyed fathers and hazel-eyed mothers in
this sample do not follow this rule, but the generalization is apparently
as good as any.

3. If one parent is brown-eyed and neither is blue-eyed, predict brown eyes
(26 of 43 instances correct, or 60 percent).

4. If one parent has hazel eyes and the other gray or green eyes, predict
blue eyes (5 instances of 6 correct, or 83 percent). Note: Instances are
too few to be comfortable about this generalization. It can be regarded
as tentative, pending further observations.

These results are summarized in the table. In this sample of 195, rules
based on two independent variables lead to 132 correct placements of
children in eye color categories, compared to 110 and 117 correct
placements when information on only one parent is used. The increase of
accuracy is of the order of 20 percent, and may be considered substantial.

³ It should be remembered that any observed relationship may be a function of the group
in which the observations are made. The eye color of mothers might be very substantially
associated with eye color of fathers in one subracial group and not at all in another.
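
The comparison of single-parent and joint prediction can be retraced from the counts of correct predictions reported in Tables 2.1, 2.2, and 2.4. The sketch below is illustrative; the dictionary keys and variable names are assumptions, and only counts stated in the text are used.

```python
n_total = 195

# Correct predictions, summed from the counts reported in the text.
correct = {
    "father only": 55 + 6 + 15 + 41,     # Table 2.1: predict the father's color
    "mother only": 47 + 7 + 15 + 41,     # Table 2.2: predict the mother's color
    "both parents": 67 + 34 + 26 + 5,    # the four joint rules applied to Table 2.4
}

# Proportion of correct predictions under each scheme.
proportion_correct = {rule: round(hits / n_total, 3) for rule, hits in correct.items()}

# Relative gain of joint prediction over each single-parent prediction.
relative_gain = {
    rule: round(correct["both parents"] / correct[rule] - 1, 3)
    for rule in ("father only", "mother only")
}

for rule in correct:
    print(f"{rule:<14}{correct[rule]:>5}{proportion_correct[rule]:>8.3f}")
print("Relative gain of joint prediction:", relative_gain)
```

The gain is 20 percent relative to prediction from the mother alone and about 13 percent relative to prediction from the father alone.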


DIFFERENTIAL EFFECTIVENESS OF PREDICTION 


It is to be noted that parts of a nominal scale may be more predictive than
other parts. For example, as shown in Table 2.1, 65 percent of brown-eyed
fathers had brown-eyed children, while only 41 percent of gray or
green-eyed fathers had gray or green-eyed children. Accordingly, with
brown-eyed fathers there is better information about the probable eye
color of their children than with gray or green-eyed fathers. With joint
prediction, the same situation applies. With two parents with identical
eye color, prediction should be correct in about 87 percent of the cases,
while with the rule of predicting blue eyes when only one parent is
blue-eyed, a rate of only about 51 percent of correct prediction is to be
anticipated.

The principle of differential likelihood has wide application in clinical
psychology. When a few facts about a patient are reasonably predictive,
as in some kinds of mental deficiency, decisions may be made quickly.
When, however, the information is in categories or parts of scales that
are not highly predictive, and when there is hope that further investigation
will yield new and pertinent facts, it may be appropriate to postpone a
decision until further information is available.

Research applicable to clinical psychology consists in large part of
finding ways of classifying people in categories and of determining the
improvement to be expected under different types of treatment.
Classification is by no means limited to nominal scales, although nominal scales
are almost always involved. As probabilities of desirable outcome
increase in defined categories, a rational basis for clinical practice is
established.
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NEED FOR CROSS-VALIDATION 


When rules for prediction are developed in one sample, they cannot be 
considered as established until tested in one or more subsequent samples. 
The process of trying out on a new group a predictive method developed 
in one group is called cross validation. Because of fluctuations in the 
composition of successive samples, fluctuations in the success of the 
predictive system are to be anticipated. In addition, predictive rules in 
one sample may capitalize on associations that occur in that sample, but 
may not occur in the population. Accordingly, when the rules developed 
in one sample are applied to cases previously unobserved, prediction is 
likely to be less effective than in the sample used in formulating the rules. 


SUMMARY 


The concepts of N, frequencies within classes, percentages and proportions, 
permeate descriptive statistics. While various statistical procedures based 
on ranking or measurement along a scale in terms of additive units are not 
applicable to categorical data, all types of observations can be arranged 
in categories and treated by methods described in this chapter. Accordingly, 
these methods are fundamental in descriptive statistics. 

The matter of determining from categorical data the likelihood that the
obtained distribution is in accordance with some hypothesis is treated in
the next chapter.


EXERCISES 
1. The following data are reported by Hunter (3): 


Distribution of Entries in Various Sections of 
Psychological Abstracts, Volume 25 


ENTRIES 
1. General and statistics 1024 
2. Physiological psychology 397
3. Receptive processes 779 
4. Response processes 252 
5. Complex processes 897 
6. Developmental psychology 505 
7. Social psychology 1149 
8. Clinical psychology 1215 
9. Behavior deviations 2030 
10. Educational psychology 857 
11. Personnel psychology 387 
12. Industrial psychology 371 


Convert the frequencies by section to percentages (nearest whole number)
and to proportions (correct to three places of decimals).

2. The number of impairments for males and females in the United States has
been reported as follows (7):


NUMBER OF IMPAIRMENTS
(IN THOUSANDS)

                                          MALE     FEMALE
Blindness                                  382        578
Other visual impairment                  1,053      1,011
Hearing impairments                      3,276      2,547
Speech defects                             706        392
Paralysis                                  487        453
Absence of fingers or toes               1,195        233
Absence of major extremities               210         72
Impairment of lower extremities          1,823      1,331
Impairment of upper extremities          1,023        659
Impairment of limbs, back, or trunk      2,433      2,593
All other impairments                      582        777

ALL IMPAIRMENTS                         13,170     10,646


Within each group of impaired individuals (the male group and the female
group), compute the proportion that have each type of impairment.


3. In a study of 138 highway accidents, Barmack and Payne (1) report the
following data relating accident site to driver condition:


                       DRIVER CONDITION
ACCIDENT SITE     NOT DRINKING     DRINKING
Straightaway           26             30
Curve                   8             37
Intersection           15             22


For each accident-site category, compute the proportions of drivers in each
driver condition category.


4. Wickert (8) reports the following answers on feelings about supervision of
96 employees still with the company as contrasted with 48 employees who
have left:


                     SUPERVISION REGARDED     SUPERVISION REGARDED
                     AS POOR OR AVERAGE       AS GOOD
Still with company n x 


Have left company 
For each group find the proportion regarding the supervision as good. 
5. Kurtz (4) presents the following data on the relationship between two
categories of a special Rorschach scoring system and success as a district
manager for life insurance sales:


                                       POORER       BETTER
                                       MANAGERS     MANAGERS
Zero or positive Rorschach signs          11           12
Negative Rorschach signs                   9            9


Compute and interpret C.


6. Gleason (2) reports the following two-way frequency distribution showing the
relationship between scores on an information test on the Taft-Hartley law
and attitude toward the law. While this information can be considered scaled,
it is amenable to treatment by methods applicable to categorical data.
Compute and interpret C.


SCORE ON                ATTITUDE TOWARD TAFT-HARTLEY LAW
INFORMATION TEST        FAVOR      OPPOSE     NO OPINION
10-13                     93         137          53
7-9                      155         198         126
2-6                       75         118         133


7. In a study on the prediction of attrition in trade school courses, Patterson (6)
reports the following results:


                            PREDICTED CLASSIFICATION
ACTUAL CLASSIFICATION       FAIL      PASS      TOTAL
Pass                         14       156        170
Fail                         36        92        128
TOTAL                        50       248        298


Compute and interpret C. 


8. The following data are reported by Pascal et al. (5): 
Status of 264 Schizophrenics One Year after Discharge from Mental Hospital 


UNIMPROVED IMPROVED 


Paranoid 87 36 
Catatonic 38 38 
Hebephrenic 29 3 
Simple 5 6 
Mixed 14 8 


To study the relationship between type of schizophrenia and prognosis, 
compute C. What is the maximum possible value of C for a 5 x 2 diagram? 


N 


. GLEASON, JOHN G., “Attitude vs. 


. HUNTER, WALTER S., 


. PASCAL, G. R., SWENSON, 


. PATTERSON, C. H., “The pre 


. WICKERT, FREDERIC R., ** Turnover, and emplo:
AN INTRODUCTION
TO CHI SQUARE 


3 


THE FUNCTION OF CHI SQUARE IN COMPARING THEORY 
AND FACT 


Statistics that merely describe a particular sample are of preliminary
interest only. Theoretical psychology requires generalized knowledge, that is,
principles and laws that relate to instances not yet observed. In generalizing
beyond specific samples and in making inferences about populations
represented by them, the chi¹-square test (the symbol for which is χ²) is one of
the most serviceable techniques yet devised. Although it applies usually to
frequencies within categories, it is useful with data collected with all the
types of scales employed in psychological research.

This chapter is concerned principally with using chi square to make
inferences from categorical data. Procedures with chi square, however, are
not affected by the type of measurement or description on which the classes
themselves are based. Later on, in Chapter 12, in connection with a
discussion of the logic of various chance distribution curves, chi square will
be considered further.


¹ Chi rhymes with “try” and is pronounced with a hard “c” or “k” sound. The “h”
is silent.


STEPS IN TESTING HYPOTHESES WITH CHI SQUARE


The development and testing of hypotheses involving data tabulated in 
categories require seven steps: 


1. Within some area of investigation the formulation of a hypothesis that 
relates to how frequencies are distributed into categories; 


2. Collection of pertinent observations; 
3. Classification of these data into categories and determining the actual
class frequencies, denoted as the f_o's;

4. Determination of how the data would be distributed if the basic
hypothesis were true, thus finding for each class the f_e, the expected or
theoretical frequency;

5. Computation of χ², which is zero when each f_o equals its corresponding
f_e and increases as differences increase;

6. Determination of the likelihood that the particular value of χ² (or some
greater value) would occur if each f_o differed from the corresponding f_e
only by chance; and

7. Rejection of the basic hypothesis if the value of χ² is so large that it is
highly improbable that it could have occurred by chance; or continuing
the basic hypothesis as a possibility if χ² is small.


THE NULL HYPOTHESIS 
Generally speaking, evaluation of the difference between the f_o's and f_e's
is in terms of a “null hypothesis,” which is most often not the basic
hypothesis of the study (since the experimenter generally seeks true differences
in frequency between categories). In connection with χ², the null hypothesis
states that there is no difference between theoretical expectancy and
empirical findings beyond what might reasonably be expected to occur by chance.
The null hypothesis can never be proved, since if no difference is found in a
series of investigations, a difference might still be found in an investigation
yet to be conducted. On the other hand, if the differences between empirical
findings and theoretical expectancies are greater than can reasonably be
expected by chance, the null hypothesis is regarded as disproved, and it is
necessary to take the view that something more than chance is responsible
for the differences.

Actually, the null hypothesis is never completely rejected. It is rejected
at a stated “level of significance,” usually the 5 percent level or the 1
percent level. If an investigator is willing to make wrong decisions as often as
5 times in 100, by rejecting the null hypothesis when it should not be
rejected, the 5 percent level is selected as critical. If one wishes a still higher
degree of certainty, the 1 percent level can be used. Of course other levels
can be selected, or in any given study, the precise probability can be stated
that the difference between what is expected according to the basic
hypothesis and the empirical findings represents a parameter value greater
than zero.

As are other statistics of inference to be discussed in later chapters, chi 
square is not a direct measure of the degree of relationship. It is used in 
obtaining an estimate of the likelihood that some factor other than chance 
is operating in the area under investigation and is causing the obtained dis- 
crepancies between observed facts and the basic hypothesis. When the 
numerical value of chi square is larger than can reasonably be expected by 
chance, doubt is cast on the principle by which the theoretical frequencies 
were set up, and perhaps the principle will be actually rejected. If repeated 
investigations show no discrepancies beyond those that might reasonably 
be accounted for by chance, the principle receives support and may in time 
gain general acceptance. 


In the case of data within categories, the following are examples of prior 
hypotheses that can be tested with chi square: 

1. The hypothesis that cases are uniformly distributed among categories.
In working with a multiple-choice maze, the hypothesis might be that
without prior training, white rats are equally likely to choose any of, say,
four alternative routes.

2. The hypothesis that cases follow a predetermined distribution.
According to Mendelian principles of heredity, dominant and recessive
characteristics at a certain stage of selective mating will be distributed in
the proportion of three dominants to one recessive. With an appropriate
sample of cases, the fit of observations to theory can be tested by chi square.
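
The second kind of hypothesis can be illustrated with a short Python sketch. The counts below are hypothetical, invented only to show the arithmetic, and the 5 percent critical value for one degree of freedom (3.841) is taken from standard chi-square tables rather than from this chapter. The function names are likewise illustrative.

```python
def expected_from_ratio(n, ratio):
    """Split n cases among categories in the proportions given by ratio."""
    total = sum(ratio)
    return [n * part / total for part in ratio]


def chi_square(observed, expected):
    """Sum of squared discrepancies, each divided by its expected frequency."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


# Hypothetical sample: 70 dominant and 30 recessive cases (not from the text).
observed = [70, 30]
expected = expected_from_ratio(sum(observed), [3, 1])  # 3:1 gives 75 and 25

chi2 = chi_square(observed, expected)

# Standard tabled value for df = 1 at the 5 percent level (an outside fact).
CRITICAL_5_PERCENT_DF1 = 3.841
reject_null = chi2 > CRITICAL_5_PERCENT_DF1

print(f"chi square = {chi2:.3f}, reject null hypothesis: {reject_null}")
```

With these invented counts the discrepancies are small enough to be ascribed to chance, so the 3:1 hypothesis would be retained as a possibility.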


CHI SQUARE IN TESTING THE PRESENCE OF ASSOCIATION 
A second use of chi s 


coefficient С. 


In testing association with chi square, the theoretical frequencies are, in
effect, provided by the null hypothesis rather than by the hypothesis in
which the investigator is interested directly. In a study of the inheritance of
eye color, for example, the investigator really wishes to determine whether
eye color of the mother is related to the eye color of the child. However,
this hypothesis lacks precision and does not lend itself to statistical testing.
A better approach is to explore its converse. The first step is to determine
the frequencies expected within the table if both marginal distributions
were maintained, but without association between the two variables. If the
value of chi square based upon the differences between observed and
theoretical frequencies is greater than can be attributed to chance at a specified
level of significance, we reject the null hypothesis, that is, the idea of no
association between the variables is rejected. We can then accept the
hypothesis that in the population represented by the sample, there is some
association. No matter how large chi square may be, it never directly assesses
the degree of such association, since chi square merely tests the presence of
linkage. Measuring linkage requires other steps, such as finding C.


CHI SQUARE IN COMPARING TWO EMPIRICAL DISTRIBUTIONS 


A third use of chi square is in testing two obtained distributions to see
whether the differences between them can be attributed to chance.
Suppose that we draw two samples of individuals and record their
religious preferences. The question is whether the two samples have similar
distributions of religious preferences or whether the samples represent
essentially different segments of the population.

The null hypothesis is that any difference between the two samples can
be ascribed to sampling error. Identical categories must, of course, be
used for the two samples and the procedure tests whether the differences
within the distributions of the two samples can be attributed to chance.


FORMULAS FOR CHI SQUARE 


The basic formula for chi square is

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \qquad (3.1)$$

in which f_o is any observed frequency and f_e is the corresponding frequency
expected under some hypothesis. Because the differences between observed
and expected frequencies are squared, chi square is never negative. Since
each squared difference is divided by the expected frequency, chi square is,
in a general way, a weighted average of the squared discrepancies. If there
are no discrepancies at all, χ² is 0. As discrepancies increase, chi square
increases, its maximum value increasing with the number of cases and
decreasing with the number of categories. In working with chi square, it will
be remembered that

$$\sum f_o = \sum f_e = N \qquad (3.2)$$

that is, the sum of the expected or theoretical frequencies equals the sum of
the observed frequencies, and of course both sums equal N.


COMPUTING FORMULAS FOR CHI SQUARE 
While Formula 3.1 indicates the nature of chi square, the following
computing formula is usually more convenient:

$$\chi^2 = \sum \frac{f_o^2}{f_e} - N \qquad (3.3)$$

To develop Formula 3.3, each (f_o - f_e)² is expanded to (f_o² - 2f_o f_e + f_e²)
and divided by f_e, yielding (f_o²/f_e - 2f_o + f_e). When this expression is
summed, the result is Σ(f_o²/f_e) - 2Σf_o + Σf_e. Since, by Eq. 3.2, Σf_o =
Σf_e = N, the expression can be simplified to Formula 3.3.


р (3.4) 
апа ; 

a LU 

x 100557 7 иа 


[Figure 3.1: histogram of the chi square distribution for df = 2. Vertical
axis: percent frequency. Horizontal axis: values of χ². Marked on the figure:
χ² = 5.99, P = .05 because by chance only 5 percent of χ²'s will exceed 5.99;
χ² = 9.21, P = .01 because by chance only 1 percent of χ²'s will exceed 9.21.]

FIG. 3.1. THEORETICAL DISTRIBUTION IN HISTOGRAM FORM OF χ²'s FROM
UNRELATED f_o's AND f_e's (df = 2, SINCE TWO OF THE DEVIATIONS ARE
INDEPENDENT)


percent f than in the original frequencies, in order to reduce effects of
rounding error.

In testing the independence of two variables, it is not necessary to
compute the expected frequencies because we can use directly the row
frequencies, the column frequencies, and the N on which the f_e are based.
Substitution of f_r f_c/N for f_e in Formula 3.3 yields

$$\chi^2 = \sum \frac{N f_o^2}{f_r f_c} - N = N\left[\sum \frac{f_o^2}{f_r f_c} - 1\right] \qquad (3.6)$$
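
The substitution can be written out step by step. The lines below are only a sketch of the algebra, using the symbols already defined (f_o, f_e, f_r, f_c, and N); the only fact assumed is that each expected frequency under independence is f_r f_c/N.

```latex
\begin{align*}
f_e &= \frac{f_r f_c}{N} \\
\chi^2 &= \sum \frac{f_o^2}{f_e} - N
        && \text{(Formula 3.3)} \\
       &= \sum \frac{f_o^2}{f_r f_c / N} - N \\
       &= N \sum \frac{f_o^2}{f_r f_c} - N \\
       &= N\left[\sum \frac{f_o^2}{f_r f_c} - 1\right]
        && \text{(Formula 3.6)}
\end{align*}
```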


DISTRIBUTIONS OF CHI SQUARE 
For a statistic to be useful in making inferences about the population, its 
distribution must be known. One such distribution is shown in histogram 
form as Fig. 3.1, the chi square distribution for two degrees of freedom. 
In finding the theoretical distribution of any statistic, it is assumed that, 
under the fixed set of conditions defining the statistic, only “chance” 
operates to produce variation. It can be shown that “chance,” or better 
“probability,” operates according to definite principles. If the possible 
range of variation of a statistic under stated conditions can be established, 
together with the relative frequency with which different values occur in the 
total plurality of values, the distribution of the statistic can be said to be 
established. This is accomplished by setting up the mathematical function 
that meets the stated conditions. The relative frequencies of different values 
can then be calculated. The use of these relative frequencies of chi square 
in comparing fact and theory is illustrated in Examples 3.1 and 3.2. 
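
For two degrees of freedom the mathematical function is especially simple: the proportion of chi squares exceeding a value x is exp(-x/2). This is a standard property of the chi-square distribution, not a formula stated in this chapter. The Python sketch below uses it to compute the percent frequency in each unit interval of a histogram for df = 2 and the values cut off by the 5 percent and 1 percent levels. The function names are illustrative.

```python
import math


def chi2_upper_tail_df2(x):
    """Proportion of chi squares (df = 2) expected to equal or exceed x."""
    return math.exp(-x / 2.0)


def critical_value_df2(p):
    """Value of chi square (df = 2) exceeded by chance with probability p."""
    return -2.0 * math.log(p)


def histogram_percentages(upper=12):
    """Percent of chi squares falling in each interval k to k + 1."""
    return [
        100.0 * (chi2_upper_tail_df2(k) - chi2_upper_tail_df2(k + 1))
        for k in range(upper)
    ]


bars = histogram_percentages()
for k, percent in enumerate(bars):
    print(f"{k:2d} to {k + 1:2d}: {percent:5.1f} percent")

print(round(critical_value_df2(0.05), 2))  # 5.99
print(round(critical_value_df2(0.01), 2))  # 9.21
```

For other degrees of freedom the function is more involved, which is one reason tabled values are ordinarily used.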


EXAMPLE 3.1


A χ² PROBLEM WITH 2df


The theoretical distribution of χ² when df = 2 is derived to fit the condition
that, when N is known for a three-category variate or when marginal frequencies
are known for a 3 x 2 or a 2 x 3 bivariate² distribution, there are two and only
two categories in which the frequencies can vary independently and freely within
the overall limitations. This distribution in histogram form is shown in Fig. 3.1.

Consider a market survey in which 300 persons are asked to state their
preference for one of three automobiles. The results are given in the accompanying
table.

           PREFERRED BY
Car A      115 = f_A
Car B       87 = f_B
Car C       98 = f_C
TOTAL      300 = N


² The term bivariate refers to two variables considered simultaneously.

In a single variate distribution upon which a single restriction has been placed
(in this case, a fixed N), the number of degrees of freedom is one less than the
number of categories. Accordingly, in this example, df = 2. This means that
within the limitation of the total number of cases, two categories are more or
less independent. When N and the frequencies of any two categories are known,
the frequency of the third category is readily deduced. Since N = f_A + f_B + f_C,
it is easy to find any missing value from the other three values.

Two hypotheses are adopted:

1. The first is that in the population represented by this particular sample,
preferences are evenly distributed. Theoretical or expected frequencies are always
established on the basis of some tentative principle, and if the theory of even
distribution is adopted, each f_e would be 100. Like the f_o, the f_e must add up to N.

2. The second is the “null hypothesis,” which is tested directly by χ². In this
case it states that there are no deviations from the theoretical frequencies beyond
those that can be accounted for by chance.

It is permissible to compute χ² in either of two ways, by Formula 3.1 or by
Formula 3.3, the latter being preferred when theoretical frequencies are decimal
fractions or when a calculating machine is used. In the table below, all
computations are written out in full. If the calculating machine has the “nonentry”
feature so that numbers may be squared without affecting the quotient dials,
Σ(f_o²/f_e) can be found without recording any of the f_o²'s or quotients of the
type f_o²/f_e.

                                COMPUTATIONS BY                   COMPUTATIONS BY
CAR                               FORMULA 3.1                       FORMULA 3.3
PREFERRED   f_o    f_e   (f_o - f_e)  (f_o - f_e)²  (f_o - f_e)²/f_e    f_o²    f_o²/f_e
A           115    100       15           225            2.25          13225    132.25
B            87    100      -13           169            1.69           7569     75.69
C            98    100       -2             4             .04           9604     96.04
                                                  χ² =   3.98   Σ(f_o²/f_e) =   303.98


By Formula 3.1, χ² = 3.98
By Formula 3.3, χ² = Σ(f_o²/f_e) - N
                   = 303.98 - 300 = 3.98


The obtained chi square of 3.98 must be compared with the theoretical
distribution of chi-square values that would be obtained purely by chance. Inspection
of the curve for two degrees of freedom in Fig. 3.1 shows that (5.3 + 3.2 + 2.0 +
1.2 + .7 + .4 + .3 + .2 + .2) percent, or 13.5 percent, of chi-square values would
be greater than 4.00. Accordingly, P, the probability of obtaining purely by
chance a chi-square value as great as or greater than the obtained value, is .135.
In other words, there is approximately one chance in seven that a chi square as
large as 3.98 could be obtained by chance alone if the a priori hypothesis of even
distribution of choices among the three cars were true.


The P value that will cause rejection of the null hypothesis and thus cast
doubt on the original theory depends on how much of a risk of an incorrect
decision the investigator is willing to run. The most commonly accepted “levels
of significance” in psychological research are the so-called 5 percent level (meaning
that P equals .05 or less) and the 1 percent level (meaning that P equals .01 or
less). When these levels are accepted as a basis for rejecting the null hypothesis,
there is 1 chance in 20, or 1 chance in 100, respectively, of an incorrect decision.
In the present instance, where P = .135, we probably would not regard the χ²
as representing significant deviation from chance expectancy. In an unlimited
population and with perfectly even distribution of choices among the three cars, chi
squares as great as 3.98 would occur about one-seventh of the time. Accordingly
we are inclined to retain our tentative principle that in the population represented
by the sample, there is no definite trend toward one car or the other.
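
The computations of Example 3.1 can be checked with a short Python sketch. It applies Formula 3.1 and Formula 3.3 to the car-preference frequencies and then finds P from the relation P = exp(-χ²/2), which holds only for two degrees of freedom and is a standard property of the distribution rather than a formula given in this chapter. The function names are illustrative.

```python
import math


def chi_square_basic(observed, expected):
    """Formula 3.1: sum of (f_o - f_e)^2 / f_e."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def chi_square_computing(observed, expected):
    """Formula 3.3: sum of f_o^2 / f_e, minus N (requires sum f_e = N)."""
    n = sum(observed)
    return sum(fo ** 2 / fe for fo, fe in zip(observed, expected)) - n


observed = [115, 87, 98]                     # Cars A, B, and C
n = sum(observed)                            # 300
expected = [n / len(observed)] * len(observed)  # even distribution: 100 each

chi2_basic = chi_square_basic(observed, expected)
chi2_computing = chi_square_computing(observed, expected)

# Upper-tail probability; this closed form is valid for df = 2 only.
p_value = math.exp(-chi2_basic / 2.0)

print(f"Formula 3.1: {chi2_basic:.2f}")
print(f"Formula 3.3: {chi2_computing:.2f}")
print(f"P = {p_value:.3f}")
```

The exact P for 3.98 comes out slightly above the .135 read from the histogram, because the histogram value refers to chi squares greater than 4.00.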


EXAMPLE 3.2


A SECOND χ² PROBLEM WITH 2df


The following data are from Ellis (5) and show the results of a study of the
relationship between improvement in certain patients during psychotherapy and
their desire to achieve adjustment. The frequency f_o is shown for each cell.


                           LITTLE OR NO    DISTINCT       CONSIDERABLE    TOTAL
                           IMPROVEMENT     IMPROVEMENT    IMPROVEMENT     (f_r)
Considerable desire
to achieve adjustment            0               4             16           20
Moderate to no desire
to achieve adjustment           10               7              3           20
TOTAL (f_c)                     10              11             19         N = 40


In a two-way table, in which the hypothesis to be tested is that there is no
association between the two variables, the f_e within each cell can be computed by
multiplying the f_r for the row by the f_c for the column and dividing by N. For the
cells in the upper row, the f_r f_c are 200, 220, and 380. For the cells in the lower
row, the f_r f_c are also 200, 220, and 380. Since Formula 3.6 does not require f_e,
only the f_r f_c are needed. Squaring each f_o, dividing by the corresponding f_r f_c,
and summing yields

$$\sum \frac{f_o^2}{f_r f_c} = \frac{0}{200} + \frac{16}{220} + \frac{256}{380} + \frac{100}{200} + \frac{49}{220} + \frac{9}{380} = 1.4928$$

By Formula 3.6,

$$\chi^2 = N\left[\sum \frac{f_o^2}{f_r f_c} - 1\right] = 40(1.4928 - 1) = 19.71$$

Since there are two rows and three columns, the number of degrees of freedom
is df = (r - 1)(c - 1) = 1 x 2 = 2.

With 2df, Fig. 3.1 can again be used to find P. Inspection shows that a χ² of
19.71 is well beyond the upper 1 percent of the values, and hence it is significant
at better than the 1 percent level. Accordingly, the null hypothesis of no
association between the variables is rejected, and we are inclined to believe that there is
association.


The degree of association is measured by the contingency coefficient. By the
procedures of Chapter 2,

$$C = \sqrt{1 - \frac{1}{\sum (f_o^2 / f_r f_c)}} = \sqrt{1 - .6699} = \sqrt{.3301} = .57$$

It will be remembered that the distribution of chi square for 2df is only one
of a number of chi-square distributions, and that in testing any particular value
of chi square, it is essential to use the curve or values from a table appropriate to
the df. As shown in the histogram of Fig. 3.1 with 2df, approximately 39 percent
of obtained values of chi square are expected to fall between 0 and 1.0,
approximately 24 percent between 1.0 and 2.0, approximately 9 percent between 3.0
and 4.0, and so on. If there is no underlying association between the values of
the frequencies and the expected values, 5 percent of the chi squares will still
exceed 5.99. Similarly, if the association between expected and obtained
frequencies is only by chance, about 1 percent of the chi-square values will be
greater than 9.21. This illustrates what is meant by the 5 percent and 1 percent
levels of significance. If a computed chi square is 5.99 or over, such a value could
be obtained on the basis of purely chance association only once in 20 times;
hence the P value is .05. Similarly, for values of chi square of 9.21 or more,
P = .01, or more precisely, P < .01 because by chance alone only 1 percent of
chi squares will exceed 9.21.


[Figure 3.2: histogram of the chi square distribution for df = 4. Vertical
axis: percent frequency. Horizontal axis: values of χ². Marked on the figure:
χ² = 9.49, P = .05 because by chance only 5 percent of χ²'s will exceed 9.49;
χ² = 13.28, P = .01 because by chance only 1 percent of χ²'s will exceed 13.28.]

FIG. 3.2. THEORETICAL DISTRIBUTION IN HISTOGRAM FORM OF χ²'s FROM
UNRELATED f_o's AND f_e's (df = 4, SINCE FOUR OF THE DEVIATIONS ARE
INDEPENDENT)


The function yielding distributions of chi square was discovered by
Pearson about 1900, and tables of the principal chi-square curves were
prepared by W. P. Elderton. The concept of the degrees of freedom in relation
to chi square was worked out by Fisher, who, together with Yates,
produced a new set of chi-square tables oriented in terms of df and stated
values of P rather than in terms of the number of categories and stated
values of χ². For convenience, chi squares found in actual research are
evaluated in terms of these tabled values rather than by direct use of a
mathematical function.


[Figure 3.3: histogram of the chi square distribution for df = 6. Vertical
axis: percent frequency. Horizontal axis: values of χ². Marked on the figure:
χ² = 12.59, P = .05 because by chance only 5 percent of χ²'s will exceed 12.59;
χ² = 16.81, P = .01 because by chance only 1 percent of χ²'s will exceed 16.81.]

FIG. 3.3. THEORETICAL DISTRIBUTION IN HISTOGRAM FORM OF χ²'s FROM
UNRELATED f_o's AND f_e's (df = 6, SINCE SIX OF THE DEVIATIONS ARE INDEPENDENT)

The distribution in Fig. 3.1 (χ² for 2df) should be compared with the
distribution in Fig. 3.2 (χ² for 4df) and the distribution in Fig. 3.3 (χ² for
6df). It will be noted that as the degrees of freedom increase, numerical
values of chi square tend to increase, as might well be anticipated. The shape
of the distribution changes, becoming more and more symmetrical. Beyond
df = 30, the distribution of chi square is regarded as symmetrical and may
be evaluated by means of tables of a symmetrical curve, as will be described
in a later chapter.


DETERMINING THE SIGNIFICANCE OF χ²


When we have the actual χ² distributions (or close approximations as in
Figs. 3.1, 3.2, and 3.3), we can determine by inspection whether a given
χ² represents a sufficiently unusual occurrence to indicate that something
more than chance is operating to produce the discrepancies between
observed and theoretical frequencies. Here, these distributions are available
for 2df, 4df, and 6df merely to show the nature of the distributions and to
clarify the logic underlying the χ² test. In actually carrying out research,
one would want to find the significance of obtained chi squares from a
source more compact than a series of charts, one for each df.

Figure 3.4, showing selected P functions for values of chi square up to
50 and for 1 to 30 degrees of freedom, can be used in evaluating obtained chi
squares.

Entry is by means of chi square on the ordinate (vertical, or y axis) and
the degrees of freedom on the abscissa (horizontal, or x axis). From the
relation of the intersection of the lines from the two entries to the P value
curves, an idea of P may be obtained.


[Figure 3.4: curves of selected P values. Vertical axis: values of χ², 0 to 50.
Horizontal axis: degrees of freedom, 1 to 30.]

FIG. 3.4. DIAGRAM FOR FINDING SIGNIFICANCE OF χ²


A few examples of the use of this graph are given below:


              LOCATION OF POINT
df    χ²      OF INTERSECTION        INTERPRETATION
 2   10.50    Between P = .01        .01 > P > .001: Discrepancies highly
              and P = .001           significant. Null hypothesis rejected at
                                     1 percent level of significance.
      5.40    Below P = .10          Discrepancies not statistically significant
 7   18.30    Near P = .01           Table C (Appendix) should be consulted
                                     for exact value of χ² for P = .01
 8   17.50    Between P = .05        .05 > P > .01: Null hypothesis rejected at
              and P = .01            5 percent level of significance
10    2.00    Below P = .10          Discrepancies not statistically significant
14    5.10    Below P = .10          Discrepancies not statistically significant


In general, however, an obtained chi square is evaluated by means of a
table such as Table C (Appendix), which shows how large a chi square
must be to be significant at the 5 percent and 1 percent levels.

Illustrative problems involving the computation and interpretation of
χ² with 4df are given in Examples 3.3, 3.4, and 3.5.
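
Where neither a chart nor a table is at hand, P can be computed directly. The Python sketch below is one way of doing so with the standard library only; it is not the method of this book. It rests on two standard facts about the chi-square distribution: for 1df, P equals erfc(√(χ²/2)), for 2df, P equals exp(-χ²/2), and each additional 2df adds one further term. The function name `chi_square_p` is hypothetical.

```python
import math


def chi_square_p(chi2, df):
    """Probability that chi square equals or exceeds chi2, for integer df >= 1."""
    if df < 1 or chi2 < 0:
        raise ValueError("df must be at least 1 and chi2 must not be negative")
    h = chi2 / 2.0
    if df % 2:
        # Odd df: start from the 1df case.
        a = 0.5
        q = math.erfc(math.sqrt(h))
    else:
        # Even df: start from the 2df case.
        a = 1.0
        q = math.exp(-h)
    target = df / 2.0
    # Each step raises the degrees of freedom by two.
    while a < target:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Entries from the examples of chart use given above: (df, chi square).
for df, chi2 in [(2, 10.50), (7, 18.30), (8, 17.50), (10, 2.00), (14, 5.10)]:
    print(f"df = {df:2d}, chi square = {chi2:5.2f}, P = {chi_square_p(chi2, df):.4f}")
```

The printed values can be compared with the locations read from Fig. 3.4; for example, the entry for 2df and 10.50 falls between .01 and .001.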


EXAMPLE 3.3


A χ² PROBLEM WITH df = 4


Another chi-square distribution curve is shown as Fig. 3.2, this one for four
degrees of freedom. Like Fig. 3.1, its use is illustrative only, since in practice the
significance of an obtained χ² is found from a chart such as Fig. 3.4 or from a
table such as Table C (Appendix).

The χ² curve for 4df is useful in evaluating differences between obtained and
theoretical frequencies when four categories (or, rather, four deviations) are free
to vary independently. With a single categorical variable, this situation obtains
when N is fixed and there are five categories. With two nominal variables, the
number of cells for observed frequencies will be 2 by 5, 3 by 3, or 5 by 2; all of
which, by the (r - 1)(c - 1) principle, yield 4df.

Testing Whether a Single Nominal Variable Differs from an a priori Distribution.
In a market survey of soft drink preferences, 60 respondents might give first
preferences as indicated under f_o in the table below.


             f_o    f_e    f_o²/f_e
Brand I       14     12     16.333
Brand II      12     12     12.000
Brand III     11     12     10.083
Brand IV       9     12      6.750
Brand V       14     12     16.333

             Σ(f_o²/f_e) =  61.499




By Formula 3.3,

$$\chi^2 = \sum \frac{f_o^2}{f_e} - N = 61.499 - 60 = 1.499$$


The basic hypothesis in this case would be that there are no differences in
preference for the five brands. Accordingly, in each case f_e is 12. The null
hypothesis states that any discrepancies between the f_o and the f_e result from chance.
With N fixed, four categories are left to vary, and accordingly, df = 4. Chi
square is 1.499. Inspection of the curve for 4df and of the chart in Fig. 3.4
indicates that P is very high. Inspection of Table C (Appendix) shows that χ²
for four degrees of freedom would have to be 9.488 to be significant at the 5
percent level of confidence. Accordingly, there is no reason to reject the null
hypothesis, and it appears possible that the differences among the brands may result
from sampling.

It should be remembered, of course, that the sample is small. Brands I and V
have about 50 percent more first preferences than brand IV. If these same
proportions were obtained in a considerably larger sample, it is likely that χ² would
be found to be significant. In all cases, the question investigated is whether (in
the particular set of data) discrepancies between theory and observation can be
ascribed to chance.
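
The remark about sample size can be made concrete with a brief Python sketch. It recomputes χ² for the soft drink data by Formula 3.3 and then repeats the computation for hypothetical larger samples in which every frequency is multiplied by the same factor, so that the proportions are unchanged. The enlarged samples are invented for illustration; the value 9.488 is the tabled 5 percent value for 4df quoted above.

```python
def chi_square_even_distribution(observed):
    """Formula 3.3 with equal expected frequencies in every category."""
    n = sum(observed)
    fe = n / len(observed)
    return sum(fo ** 2 / fe for fo in observed) - n


observed = [14, 12, 11, 9, 14]      # Brands I through V, N = 60
CRITICAL_5_PERCENT_DF4 = 9.488      # value quoted from Table C (Appendix)

chi2 = chi_square_even_distribution(observed)
print(f"N = {sum(observed)}: chi square = {chi2:.3f}")

# Hypothetical larger samples with the same proportions.
for factor in (5, 10):
    scaled = [fo * factor for fo in observed]
    scaled_chi2 = chi_square_even_distribution(scaled)
    significant = scaled_chi2 > CRITICAL_5_PERCENT_DF4
    print(f"N = {sum(scaled)}: chi square = {scaled_chi2:.3f}, "
          f"significant at 5 percent level: {significant}")
```

Since χ² increases in direct proportion to N when the proportions are held constant, the same pattern of preferences that is not significant with 60 respondents exceeds the tabled value with 600. The small difference from the 1.499 of the text arises because the text sums quotients rounded to three decimals.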


EXAMPLE 3.4 


A PROBLEM IN A 5 x 2 TABLE 


In a study of the relationship of son's occupation to father's occupation, 
Jenson and Kirchner (6) reported numbers of sons following father's occupation 
and not following father's occupation for five occupational groups as shown in 
the table. Again, in each cell, fo is given with frfc in parentheses below.


                                SONS NOT         SONS IN
OCCUPATIONAL                    IN FATHER'S      FATHER'S         TOTAL
GROUP OF SONS                   OCCUPATION       OCCUPATION       (fr)

Operatives and allied           1,542            428              1,970
                                (9,499,340)      (3,282,020)
Craftsmen, foremen, and         1,156            632              1,788
allied                          (8,621,736)      (2,978,808)
Managers, officials, and        773              402              1,175
proprietors (except farm)       (5,665,850)      (1,957,550)
Professional and technical      664              149              813
                                (3,920,286)      (1,354,458)
Clerical and allied             687              55               742
                                (3,577,924)      (1,236,172)

TOTAL (fc)                      4,822            1,666            N = 6,488

Check: $\sum f_r f_c$ = 9,499,340 + 8,621,736 + 5,665,850 + 3,920,286 + 3,577,924 +
3,282,020 + 2,978,808 + 1,957,550 + 1,354,458 + 1,236,172 = 42,094,144 = $N^2$

$$\sum \frac{f_o^2}{f_r f_c} = 1.046438$$

By Formula 3.6,

$$\chi^2 = N\left[\sum \frac{f_o^2}{f_r f_c} - 1\right] = (6488)(.046438) = 301.29$$

$$C = \sqrt{1 - \frac{1}{\sum (f_o^2 / f_r f_c)}} = \sqrt{1 - \frac{1}{1.046438}} = \sqrt{1 - .9556} = \sqrt{.0444} = .21$$

$\sum (f_o^2 / f_r f_c)$ is computed in the usual way for contingency tables. First,
the fr and fc are multiplied together in pairs, and the frfc products are entered
in the cells under the fo. The frfc are summed as a check, since
$\sum f_r f_c = N^2$. Then each fo is squared and divided by the frfc product. The
sum of the quotients is $\sum (f_o^2 / f_r f_c)$.

The chi square of 301.29 is far beyond the graphed and tabled values, and is highly
significant. The chance that there is no association between the two variables is
exceedingly minute, and the null hypothesis can be rejected with considerable
assurance.

However, as is always the case, the value of χ² is a function of N. With 6488
cases, a very small degree of association would be found to be statistically
significant, in that the association would be too great to be ascribed merely to
random variation in the particular sample. In this example, the relationship
between occupational group and tendency to follow father's occupation is rather
low, as shown by a contingency coefficient of .21, compared with a maximum
possible C of .707 in a 5 x 2 table.
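
The steps of this example (products of marginal totals, the check against N
squared, Formula 3.6, and the contingency coefficient) can be sketched as
follows. The function names are hypothetical, and the expression for the maximum
possible C, the square root of (k - 1)/k with k the smaller of the number of rows
and columns, is an assumption that agrees with the values of .707 and .816
quoted in these examples.

```python
from math import sqrt


def contingency_chi_square(table):
    """Return (chi2, df, C) for a table of observed frequencies (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Check described in the text: the fr * fc products must sum to N squared.
    assert sum(fr * fc for fr in row_totals for fc in col_totals) == n * n
    ratio_sum = sum(fo ** 2 / (fr * fc)
                    for row, fr in zip(table, row_totals)
                    for fo, fc in zip(row, col_totals))
    chi2 = n * (ratio_sum - 1)                       # Formula 3.6
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    c = sqrt(1 - 1 / ratio_sum)                      # contingency coefficient
    return chi2, df, c


def max_contingency_coefficient(table):
    """Assumed upper limit of C: sqrt((k - 1) / k), k = min(rows, columns)."""
    k = min(len(table), len(table[0]))
    return sqrt((k - 1) / k)


# Observed frequencies from the 5 x 2 table (not in father's occupation, in it).
sons = [[1542, 428],
        [1156, 632],
        [773, 402],
        [664, 149],
        [687, 55]]

chi2, df, c = contingency_chi_square(sons)
print(f"chi square = {chi2:.2f}, df = {df}, C = {c:.2f}, "
      f"maximum C = {max_contingency_coefficient(sons):.3f}")
```

Comparing C with its maximum, rather than looking at chi square alone, keeps the
size of N from being mistaken for strength of association.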


EXAMPLE 3.5 


A χ² PROBLEM IN A 3 x 3 TABLE


Another example with 4df is based on data from Cohen (3) and involves a
comparison of a psychologist’s Wechsler-Bellevue pattern diagnoses and the
corresponding neuropsychiatric criterion diagnoses. Three hundred cases were
studied in all, with 100 in each of the following classes: psychoneurotic,
schizophrenic, and brain-damaged. The results obtained are given in the
accompanying table.


In each cell, frfc is written under the cell frequency (fo). When each fo
is squared and divided by the appropriate frfc, the result is
$\sum (f_o^2 / f_r f_c) = 1.067$.
$\chi^2 = N[\sum (f_o^2 / f_r f_c) - 1] = (300)(.067) = 20.10$. From the curve of
χ² for 4df, it is apparent that P is less than .01. Accordingly, the null
hypothesis is rejected at better than the 1 percent level, and it can be believed
that there is association between the two variables in the population represented
by the sample.

The degree of association in the sample is measured by the contingency
coefficient of .25, compared with a maximum possible of .816 for a 3 x 3 diagram.

                          NEUROPSYCHIATRIC CRITERION
PSYCHOLOGIST’S      PSYCHO-        SCHIZO-        BRAIN-         TOTAL
DIAGNOSIS           NEUROTIC       PHRENIC        DAMAGED        (fr)

Brain-damaged       20             28             43             91
                    (9100)         (9100)         (9100)

Schizophrenic       24             33             29             86
                    (8600)         (8600)         (8600)

Psychoneurotic      56             39             28             123
                    (12,300)       (12,300)       (12,300)

TOTAL (fc)          100            100            100            N = 300
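
The short-cut forms used in these examples (Formula 3.3 for a single variable
and Formula 3.6 for a contingency table) can be checked against the familiar
definition of chi square as the sum of squared deviations divided by expected
frequencies. The derivation below is a sketch that assumes that definition and,
for contingency tables, the usual expected frequency $f_e = f_r f_c / N$; neither
is restated in this part of the chapter.

```latex
\begin{align*}
\chi^2 &= \sum \frac{(f_o - f_e)^2}{f_e}
        = \sum \frac{f_o^2}{f_e} - 2\sum f_o + \sum f_e \\
       &= \sum \frac{f_o^2}{f_e} - 2N + N
        = \sum \frac{f_o^2}{f_e} - N
        \qquad \text{since } \sum f_o = \sum f_e = N \\
\intertext{and, substituting $f_e = f_r f_c / N$ for a contingency table,}
\chi^2 &= N \sum \frac{f_o^2}{f_r f_c} - N
        = N\left[\sum \frac{f_o^2}{f_r f_c} - 1\right] \\
\intertext{The check on the products follows from the marginal totals:}
\sum_{r}\sum_{c} f_r f_c &= \Bigl(\sum_{r} f_r\Bigr)\Bigl(\sum_{c} f_c\Bigr) = N \cdot N = N^2
\end{align*}
```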


EXAMPLE 3.6


A PROBLEM WITH 6df 


A third histogram showing chi square under the condition of no correspon- 
dence between obtained and theoretical frequencies is exhibited as Fig. 3.3. 
It is for six degrees of freedom; that is, six of the deviations are independent. 
Accordingly, it can be applied to test whether the distribution of a single variable 
with seven categories is in accordance with some hypothesis, or it can be used to 
test association in contingency tables of the following forms: 2 x 7, 3 x 4, 4 x 3,
and 7 x 2.

The data in the table, from Patterson (8), represent passing and failing in
seven trade school courses. In each cell, fo is given with frfc in parentheses
below.


SCHOOL COURSE                   FAIL         PASS         TOTAL (fr)

Automobile general              22           37           59
                                (5723)       (9853)

Building construction,          11           22           33
drafting, and estimation        (3201)       (5511)

Electric general                12           29           41
                                (3977)       (6847)

Machine shop                    8            25           33
                                (3201)       (5511)

Mechanical drafting             19           12           31
                                (3007)       (5177)

Printing                        6            13           19
                                (1843)       (3173)

Radio and electronics           19           29           48
                                (4656)       (8016)

TOTALS (fc)                     97           167          N = 264


By χ² it is possible to test whether there is an association between type of
course and success in training.

The frfc are checked by summing and comparing with $N^2$.

$$\sum f_r f_c = 69,696 = N^2$$

Then,

$$\sum \frac{f_o^2}{f_r f_c} = 1.04463$$

$$\chi^2 = N\left[\sum \frac{f_o^2}{f_r f_c} - 1\right] = 264 \times .04463 = 11.78$$


Figure 3.3 shows that with 6df, more than 5 percent of chi squares obtained by 
chance exceed 11.78. Accordingly, by the conventional standard, there is no 
need to reject the null hypothesis of no association between type of course and
success in training.
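
Where a graph or table of chi square is not at hand, the probability of a chi
square at least as large as the one obtained can be computed directly when df is
even. The sketch below relies on the standard closed form of the chi-square
upper tail for even degrees of freedom, which is not given in this text; the
function name is illustrative.

```python
from math import exp, factorial


def chi_square_tail_even_df(chi2, df):
    """P(chi square >= chi2) when df is an even positive integer.

    Closed form: exp(-x/2) * sum((x/2)**k / k!) for k = 0 .. df/2 - 1.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form holds only for even, positive df")
    half = chi2 / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


p_courses = chi_square_tail_even_df(11.78, 6)   # Example 3.6
p_tabled = chi_square_tail_even_df(9.488, 4)    # tabled 5 percent value, 4df
p_cohen = chi_square_tail_even_df(20.10, 4)     # Example 3.5

print(f"6df, chi square 11.78: P = {p_courses:.3f}")
print(f"4df, chi square 9.488: P = {p_tabled:.3f}")
print(f"4df, chi square 20.10: P = {p_cohen:.4f}")
```

For 11.78 with 6df the probability comes out a little under .07, in agreement
with the reading of Fig. 3.3 that more than 5 percent of chance values exceed it.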


COMBINING CATEGORIES WHEN fe ARE SMALL 


The way in which the distributions of χ² are derived places restrictions on
its use.

In the first place, theoretical or expected frequencies must not be too
small. The dividing line for “too small” is arbitrary and varies with different
mathematical statisticians, from 5 to 20. The consensus favors 5 as the
smallest fe permissible. Sometimes the only recourse is to gather sufficient
data so that all cells have a minimum fe of 5. In other cases, however, where
categories may be logically combined, the total number of categories can
be reduced by consolidation, making all cells meet the requirement. Such
is the situation in Exercise 8 at the end of this chapter, where certain
information from Example 2.5 has been revised. In Example 2.5 there were too
few cases of hazel eyes to permit the chi square test to be made in the total
4 by 4 table. By combining the categories of brown eyes and hazel eyes
(various shades of off-brown), the test could be appropriately applied.

It should be noted that there is no restriction on the size of fo, which may
be zero.
YATES’ CORRECTION FOR CONTINUITY FOR 1df

A special case comes in a 2 x 2 diagram when one fe is less than 5. Here
there is no way to combine categories. Instead, the “correction for
continuity” developed by Yates is applied. It brings the computation of χ²
into closer harmony with its mathematical development, which involves a
continuous function rather than discrete frequencies. A numerical example
is given as Example 3.7.
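
As a rough illustration, the correction is commonly written as a reduction of
each absolute deviation between fo and fe by .5 before squaring. The sketch
below assumes that standard form, which may differ in layout from the formula
used in Example 3.7. The 2 x 2 frequencies are invented for illustration, and
3.841 is the standard tabled 5 percent value for 1df.

```python
def chi_square_2x2(table, yates=True):
    """Chi square for a 2 x 2 table, optionally with Yates' correction."""
    (a, b), (c, d) = table
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    chi2 = 0.0
    for row, fr in zip(table, row_totals):
        for fo, fc in zip(row, col_totals):
            fe = fr * fc / n
            deviation = abs(fo - fe)
            if yates:
                # Reduce each deviation by .5, but never below zero.
                deviation = max(deviation - 0.5, 0.0)
            chi2 += deviation ** 2 / fe
    return chi2


# Invented frequencies; one expected frequency (10 * 11 / 24) is below 5.
invented = [[2, 8],
            [9, 5]]

CRITICAL_5_PERCENT_1DF = 3.841

uncorrected = chi_square_2x2(invented, yates=False)
corrected = chi_square_2x2(invented, yates=True)
print(f"uncorrected = {uncorrected:.2f}, corrected = {corrected:.2f}")
print("significant without correction:", uncorrected >= CRITICAL_5_PERCENT_1DF)
print("significant with correction:", corrected >= CRITICAL_5_PERCENT_1DF)
```

With these invented frequencies the uncorrected value exceeds the tabled value
while the corrected value does not, which shows why the correction matters when
an expected frequency is small.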




Test the hypothesis that differences among the first four candidates are the 
result of sampling differences only. 


2. In a study of students entering Wellesley in 1892, Calkins (2) reported the
following data on the relationship between pseudo-chromesthesia and mental
“forms.”


                                        NOT
                        REPORTING       REPORTING
                        PSEUDO-         PSEUDO-
                        CHROMES-        CHROMES-
                        THESIA          THESIA          TOTAL
Reporting
mental “forms”          44              17              61
Not reporting
mental “forms”          127             15              142
TOTAL                   171             32              203


Test the association of pseudo-chromesthesia with mental “forms” by use of 
chi square. 


3. In an experiment on perception, Bruner and Minturn (1) reported the following
data:


NUMBER OF SUBJECTS DRAWING A BROKEN-B
STIMULUS FULLY OR PARTLY CLOSED, OR OPEN,
UNDER THREE CONDITIONS OF EXPECTATION

                        SEEN FULLY OR
EXPECTATION             PARTLY CLOSED       SEEN OPEN
Number                  2                   22
Number or letter        8                   16
Letter                  16                  8


Test with chi square whether there is a relationship between expectation and
perception of the stimulus.


Compute C directly and compare the result with C as found by Formula 3.7. 


4. In a study of interpersonal trust and communication, Mellinger (7) reported
the following data (slightly modified) on the relationship between accuracy
in perceiving another's attitude and actual level of agreement in attitude.


ACCURACY OF                 LEVEL OF AGREEMENT
PERCEIVING        0 or 1      2       3       4       TOTAL
4                    6        5      28      40          79
3                    8       21      64      13         106
2                    8       25       3       4          40
0 or 1              17        0       0       2          19

TOTAL               39       51      95      59         244

By chi square, test whether there is a relationship between accuracy of
perceiving another’s attitude and actual level of agreement.


5. Sloan and Harmon (9) reported that of 332 feeble-minded with mental ages
less than three who showed changes in I.Q. on retest, 102 gained in I.Q. and 230
lost. Test the hypothesis that in the population represented by the sample,
half will gain and half will lose in I.Q.

Of the 102 gainers, 51 were male and 51 were female; of the 230 who lost in
I.Q., 132 were male and 98 were female. Test whether there is an association
between sex and direction of change in I.Q.

6. At a psychological research unit during World War II (4) recommendations
for different types of training for aviation cadets and for student officers were
made as follows:


                            STATUS
TYPE OF             AVIATION        STUDENT
TRAINING            CADETS          OFFICERS
Bombardier          1,401           47
Navigator           3,393           65
Pilot               15,775          813


Using the chi-square test, determine whether the type of training recommended
was related to status as aviation cadet or student officer.


7. Two samples of 5000 aviation cadets each reported previous flying experience
as follows (4):
SAMPLE A SAMPLE B 


Held pilot’s private or commercial license 648 509 
Student pilot certificate, solo privileges 411 494 
Student pilot certificate 175 206 
Had been passenger in plane, no instruction 2984 2910 
Never had been passenger in plane * ы 


Previous military flying instruction 


Test the null hypothesis that there are no differences in previous flying
experience in the two samples.

8. The following data are modified from Example 2.5, hazel eyes being combined
with brown eyes to avoid cells with fe below 5. Test with chi square whether
there is association between eyes of father and eyes of mother.


                            FATHER'S EYES
                                GRAY-       HAZEL-
MOTHER’S EYES       BLUE        GREEN       BROWN       TOTAL
Hazel-Brown         24          17          54          95
Gray-Green          12          9           19          40
Blue                27          11          22          60

TOTAL               63          37          95          195


DESCRIPTION BY RANKING


4


THE NATURE OF ORDINAL DESCRIPTION 

A type of description of much interest in psychology is ranking, the basis
of what is often called the ordinal scale. The adjective “ordinal” refers
to order or rank as the fundamental characteristic of the quantitative
description involved.

Essentially, an ordinal scale is used whenever individuals are placed in
order according to the degree to which they possess a characteristic, such
as proficiency in tennis, ability to speak French, or skill in argumentation.
Devices for assessing handwriting or composition, in which key specimens
are arranged in order of merit, are examples of formal ordinal scales.

With nominal data, numbers are used chiefly for N, for frequencies, for
proportions and percentages, and for values of statistics such as chi square
and the contingency coefficient. Only occasionally is a number used to
designate a class or category, and then the number carries no implication of
rank or value. With ordinal scales, however, numbers are not only used
for statistics, but also to describe categories as having more or less of some
attribute or characteristic.

JUDGMENTS WITH NOMINAL AND ORDINAL SCALES 
In using a nominal scale, as described in the preceding chapter, an
individual is judged as to whether he should or should not be assigned to a
certain category. If he cannot be placed in the first category examined,
other categories are considered until one is found in which he can be
correctly placed. The act of judgment required is essentially that of
differentiating between “equal” and “not equal”; that is, whether or not
the individual possesses the characteristic or attribute differentiating the
class from other classes.

A formal ordinal scale generally requires these same judgments of 
equality, but in addition, since the numbers describing the categories 
represent relative magnitudes, judgments of “more than” or “less than” 
are essential both in building the scale and in using it. With an informal 
ordinal scale, as when a supervisor is asked to rank the men under him as 
to their usefulness to the company, discriminations of “better than” and
“poorer than” or “more than” and “less than” are needed. Essentially,
then, ordinal measurement in psychology involves judgments as to 
variation in the degree to which different individuals possess a characteris- 
tic, but there is no implication of measuring the characteristic in equal units. 
On a formal ordinal scale, all that is assumed is that of the characteristic 
described by the scale, one category involves more or less than another 
category. When individuals are ranked directly, there is implication of a 
characteristic that varies in degree, but the question of how many units of 
the characteristic possessed by each individual is ignored. 


THREE METHODS OF RANKING 


In practice, the use of ranking as measurement in psychology involves one 
of three procedures: 
1. The establishment of a formal ordinal scale in which categories are 
defined (possibly by means of samples) and arranged in order. Each in- 
dividual case is then judged as to the category to which it belongs. Order- 
of-merit scales for evaluating work products are of this type. 

2. Ordering a group from high to low or from low to high with regard
to a defined characteristic. It is a matter of convenience whether low or
high numbers represent the greater merit, but it is generally the former.
The scoring system for cross-country meets is based upon the order in which
individuals finish the run, and hence uses an ordinal scale of this type.

3. The application of a device yielding numerical scores of some sort
(number of items right, or some function of time or errors) and
treating the scores as though they correctly differentiate individuals
in relative order, with ties permitted, but with no requirement that the
original scores represent units. This is a method frequently used with
psychological test data and constitutes the foundation of the use of
percentiles in descriptive statistics.

All the statistics applicable to nominal scales (N, class frequencies,
percentages and proportions, the mode, the contingency coefficient, and chi
square) apply generally also to ordinal scales (although some of these would
not ordinarily be used with a set of N ranks, as described below). In
addition, there is a new family of descriptive statistics which can be used with
ordinal data but not with nominal categories. Of these, the most fundamental
is rank order, with the percentile as probably the statistic of widest
application.


FOUR DESCRIPTIVE STATISTICS BASED ON RANKING 
The use of order as a means of description makes possible four important 
statistics. 

1. Rank or rank order. The N individuals in a sample may be assigned
ranks, beginning with 1 and continuing through 2, 3, 4, and so on to N,
using as the basis for the assignment some estimated or measured
characteristic. Thus, each individual is described by his rank order in the
sample. If the same individual is ranked in the same reference group with
respect to several characteristics, the ranks in different characteristics may
be compared numerically. For a given characteristic, however, the meaning of a
rank changes both with the population sampled and the size of the sample.
To be the best student in a section of freshman mathematics in a liberal
arts college may not represent the same attainment as being the best in a
freshman section in an engineering college. Placing second in a group of 10
may not represent the same achievement as placing second in a group of
200. Accordingly, when rank is used to reflect a psychological characteristic,
both size and composition of the group in which rank is determined
should be stated, as well as the nature of the measuring device.

If two or more individuals in a group are tied, the customary procedure
is to make an adjustment so that the sum of the ranks is maintained.
Suppose, for example, that after the first two individuals have been ranked, two
are tied for third place. The two vacant ranks corresponding to the two
ties are averaged, giving each the rank of 3.5. The next individual would be
assigned the rank of 5. The series would thus be: 1, 2, 3.5, 3.5, and 5.
Similarly, if these five were followed by three individuals tied for next place,
the vacant ranks would be 6, 7, and 8, and the three would be assigned the
average rank (sometimes called the midrank) of 7.

2. Percent position or percent rank. In order to make ranked data
comparable, irrespective of the size of the reference group, rank orders may be
translated into “percent position” or “percent rank,” in which higher
values represent the more desirable standing. The conversion formula is

$$\text{Percent rank} = \frac{100(N + .5 - R)}{N} \qquad (4.1)$$

in which R is the original rank (with 1 at the more desirable end of the
scale) and N is the number of cases in the sample.

By formula 4.1 a person standing fourth in a group of 11 would have
a percent rank of 100(11.5 - 4)/11, or (to the nearest whole number) 68.
The interpretation is that he is theoretically higher than 68 percent of the
group but that 32 percent are higher than he. The term theoretically refers
to the fact that the individual’s own rank is partly assigned to the higher
group and partly to the lower. This is a convention by which the entire
array of ranks is divided into just two groupings instead of three, and by
which the percent of N belonging to the individual's own rank is eliminated
from influencing the statistic. Actually, percent ranks are exactly the same
in theory and in interpretation as the percentile ranks to be described
below, except that they are derived directly from ranks instead of from
measures that have been placed in order.

3. A percentile may be defined as the point below which a given percent 
of the scores or values in a frequency distribution theoretically fall. The 
theory of the frequency distribution when data are treated as ordinal, the 
method of making frequency distributions, and the computation, interpre- 
tation, and application of percentiles form the major portion of this 
chapter. 

4. A percentile rank is closely allied to the percentile. There is an
important difference. With a percentile, we start with a certain percentage of N
and find the theoretical point in the distribution below which the scores
fall. With a percentile rank, we start with some score or value that actually
occurs in the distribution and find the theoretical percentage of the
distribution that falls below it.

With percentiles, the percent of N is generally integral, while the
equivalent theoretical score is a decimal fraction; with percentile ranks, the
score is customarily integral, while the percentage equivalent, or percentile
rank, could be taken to several decimal places. With psychological data, however,
two digits of value for percentile ranks are generally considered sufficient.
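
The first two of these statistics lend themselves to a short sketch: assigning
ranks with tied cases given the average of the vacant ranks, and converting a
rank to a percent rank by Formula 4.1. The function names and the eight scores
are invented for illustration; the scores were chosen to reproduce the pattern
of ties described in the text (two tied for third place, then three tied after
the fifth).

```python
def midranks(scores, higher_is_better=True):
    """Rank scores from 1 (best), giving tied scores the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next case has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = ((pos + 1) + (end + 1)) / 2      # mean of the vacant ranks
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def percent_rank(rank, n):
    """Formula 4.1: 100 (N + .5 - R) / N, with rank 1 at the desirable end."""
    return 100 * (n + 0.5 - rank) / n


invented_scores = [98, 95, 90, 90, 88, 80, 80, 80]
ranks = midranks(invented_scores)
print(ranks)
print(round(percent_rank(4, 11)))     # fourth in a group of 11
```

Averaging the vacant ranks keeps the sum of the ranks equal to the sum of the
integers 1 through N, which is the adjustment the text describes.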


THE FREQUENCY DISTRIBUTION IN ORDINAL DESCRIPTION 
MAKING A FREQUENCY DISTRIBUTION 


A frequency distribution is a means of classifying the scores of a single 
variable. Scores are grouped in categories defined by step intervals, each of
which is a set of contiguous possible scores. In the frequency distribution
shown in Example 4.1, the step interval is 5, which means that five different 
but contiguous values are grouped together as a single class. 




EXAMPLE 4.1


MAKING A FREQUENCY DISTRIBUTION 


Nature of a Frequency Distribution. A variable with values that indicate the 
order of the cases may be divided into a number of categories or “steps.” Each 
step generally corresponds to a certain number of values on the original scale. 
The number of values in a step is known as the step interval and is denoted as i. 
In the distribution in Table 4.1, i = 5 because a case is tallied in a given step if 
it has any one of five different integral values. For example, any score, 140, 141, 
142, 143, or 144, is tabulated in the top step. 


TABLE 4.1. DISTRIBUTION OF 200 SCORES ON AN APTITUDE TEST (a)


                      SCALE DISTANCE          CUMULATIVE
                      (i/f) FOR EACH          FREQUENCIES
STEPS          f      SCORE IN STEP         Cf (b)    Cf (c)

140-144        1          5.000               200         1
135-139        2          2.500               199         3
130-134        3          1.667               197         6
125-129        6           .833               194        12
120-124        8           .625               188        20
115-119       12           .417               180        32
110-114       15           .333               168        47
105-109       20           .250               153        67
100-104       23           .217               133        90
95-99         26           .192               110       116
90-94         24           .208                84       140
85-89         20           .250                60       160
80-84         16           .313                40       176
75-79         10           .500                24       186
70-74          7           .714                14       193
65-69          4          1.250                 7       197
60-64          2          2.500                 3       199
55-59          0                                1       199
50-54          1          5.000                 1       200

(a) [Tally marks for each step, and the alternate square-tally system
    illustrated for the first five steps, omitted.]
(b) Cumulated upward from the bottom of the distribution.
(c) Cumulated downward from the top of the distribution.


The number of cases tabulated in any step is the frequency, denoted as f. In
the top step, there is only one tally. Accordingly, f = 1. The exact value of the
score represented by a tally cannot be known from the distribution. However,
the loss of information as to precise values is compensated for by:

1. The fact that the general shape of the distribution becomes apparent from
inspection of the tallies or frequencies; and

2. The fact that a convenient format becomes available for the computation of
descriptive statistics.


Preferred Practices. Although methods used in making frequency distributions 
vary, most social scientists would find the following practices acceptable: 

1. Use of 10 to 20 steps. The use of fewer than 10 steps tends to increase inac- 
curacies in computed statistics; the use of more than 20 steps tends to accentuate 
irregularities in the shape of the distribution. However, the rule is flexible, and 
practice may be modified to fit circumstances. As an example, when cases are 
few, the use of only three or four steps may be advisable in order to bring out 
a trend. 

2. Higher values at the top of the distribution; lower values at the bottom. 
This is merely the convention that applies above the origin to the y axis of 
Cartesian coordinates. 

3. A step interval of 1, 2, 3, 4, 5, 10, or a multiple of 10. These values have 
familiar multiples and hence their use reduces the possibility of error. 

4. Indicated lower limit of each step a multiple of the step interval. In the 
example, the indicated lower limits of the top three steps are 140, 135, and 130. 
The use of indicated limits that are integers makes for ease in tallying. 

5. True lower limit (the “partition value") of each step .5 below the indicated 
lower limit. In the example, the true lower limits of the top three steps are 139.5, 
134.5, and 129.5. The need for the use of the true lower limit may be noted by 
careful examination of any step. Consider, for example, the five values that may 
be tabulated in the top step: 140, 141, 142, 143, and 144. The midscore is 142, 
which must be considered to be one-half of a step interval, or 2.5 units, above 
the true lower limit. Accordingly, the true lower limit must be (142— 2.5), or 
139.5, which is .5 below the indicated lower limit. 

6. Tally marks made in groups of five. Generally, up to four cases are indicated
by vertical tallies, with the fifth case indicated by a crossline. Alternately, four
tallies are formed into a square, which is crossed by the fifth tally. Such grouping
facilitates counting.

Checking the Distribution. In making a distribution, N should be ascertained
independently of the tallying operation. Then, $\sum f = N$. The tallying operation
should also be checked separately. Dots may be placed at the ends of tallies to
indicate the second tabulation.

Original Data. When original values are used as a means of ranking cases and
identifying step limits and percentiles, each case within a step is regarded as
occupying an amount of scale distance exactly the same as that occupied by
every other case within the same step.

For example, the single case in the top step, 140-144, is conceived as occupying 
5 units of the original scale; each case in the second step from the top, 135-139, 
2.5 units; each of the three cases in the third step, 1.667 units; each of the six 
cases in the fourth step, .833 unit, and so on. 

In treating a series of scores as ordinal data, they are not added or multiplied.
Scale values, however, provide reference points, and the distance between
reference points is sometimes treated by subtraction or division, as illustrated in
Example 4.2.

Cumulative Frequencies. Examples of cumulative frequencies, denoted as Cf, 
are shown in the last two columns of Table 4.1. In one column frequencies are 
cumulated upward from the bottom of the distribution, and in the other, from 
the top downward. A cumulative frequency is merely the sum of the frequencies 
from one end of the distribution through the step on which it appears. 

In the case in which the frequencies are cumulated from the bottom of the
distribution, the lowest f is 1. Accordingly, the lowest Cf is 1. The next f is 0; so,
the second Cf is also 1. The third f is 2; so, the Cf is 1 + 2 = 3. The fourth f is 4;
so, the corresponding Cf is 3 + 4 = 7. The process continues to the top of the
distribution, where the final Cf is 200, which is N. This type of cumulation is
useful in finding percentiles, as illustrated in Example 4.2.

In the last column of Table 4.1 the frequencies are cumulated downward, a
procedure sometimes useful in treating interval scale data, as discussed in
Chapter 5. The first f is 1 and the first Cf is, of course, the same. The next f is 2,
yielding a Cf of 1 + 2, or 3. The third f happens to be 3; so, the third Cf is 3 + 3,
or 6, and so on. Again, the final Cf is 200, or N.
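
The scale-distance and cumulative-frequency columns of Table 4.1 can be
regenerated from the frequencies alone. The sketch below is one way to do so
with the Python standard library; the variable names are illustrative, and the
steps are listed from the bottom of the distribution upward.

```python
from itertools import accumulate

STEP_INTERVAL = 5

# (indicated lower limit, f) for Table 4.1, from the bottom step upward.
steps = [(50, 1), (55, 0), (60, 2), (65, 4), (70, 7), (75, 10), (80, 16),
         (85, 20), (90, 24), (95, 26), (100, 23), (105, 20), (110, 15),
         (115, 12), (120, 8), (125, 6), (130, 3), (135, 2), (140, 1)]

freqs = [f for _, f in steps]

# Cf cumulated upward from the bottom of the distribution.
cf_up = list(accumulate(freqs))
# Cf cumulated downward from the top, stored in the same bottom-up order.
cf_down = list(accumulate(reversed(freqs)))[::-1]
# Scale distance i/f for each case in a step; undefined for an empty step.
scale_distance = [STEP_INTERVAL / f if f else None for f in freqs]

rows = list(zip(steps, scale_distance, cf_up, cf_down))
for (lower, f), dist, up, down in reversed(rows):       # print top step first
    label = f"{lower}-{lower + STEP_INTERVAL - 1}"
    dist_text = f"{dist:.3f}" if dist is not None else ""
    print(f"{label:>8} {f:>4} {dist_text:>7} {up:>5} {down:>5}")
```

Both cumulations end at N = 200, which serves as a check on the frequencies.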


The grouping of scores in categories or steps is a measure of considerable
economy. It provides a means by which the general characteristics of the
collection of scores can be assessed more or less at a glance. A frequency
distribution provides a convenient means of recording observed data,
greatly simplifies the computation of descriptive statistics, and is useful in
preparing graphs and charts.

Two different assumptions are made in utilizing information from
frequency distributions:

1. All scores within a given step interval are distributed evenly throughout
that step; and

2. All scores within a given step are concentrated at the midpoint.

Obviously these two assumptions are inconsistent. The first, however, is
followed in description by ranking, as in computing percentiles, while the
second is used in description by summation, as will be discussed in the next
chapter. In both cases, however, the true bottom of the step, or the
“partition value” between it and the next lower step, is considered to be .5
below the lowest score that could be tabulated on that step.


CHOOSING A STEP INTERVAL 

In organizing observations into a frequency distribution, the first step is to 
choose an appropriate interval. If there are too many steps in a frequency 
distribution and N is not large, the curve tends to be irregular. On the other 
hand, if the number of steps is few, computations become inaccurate and 
essential characteristics of the distribution may be obscured. As a convenient
working rule, the number of step intervals in a distribution for most
research purposes should be between 10 and 20.

A good working procedure for choosing an appropriate step interval 
follows. First of all, the range is computed by subtracting the lowest score 
from the highest score and increasing by 1. This gives the total number of 
different scores possible. Suppose, for example, the highest obtained score is 
143 and the lowest is 52. Then 143 — 52 + 1 = 92, which means that within 
the total range, 92 different scores are possible. We now divide 92 by 10, 
yielding 9.2, and also by 20, yielding 4.6. Accordingly, the step interval 
should be somewhere between 9.2 and 4.6 in order to have 10 to 20 steps. 
The most frequently used step intervals are 1, 2, 3, 4, 5, 10, and the higher 
multiples of 10. Here 5 would seem to be a satisfactory choice. In Example 
4.1, the step interval of 5 actually yields 19 different steps. 

Tallying scores in a frequency distribution is essentially the same as in 
using a nominal scale except that each category is defined by two numbers, 
the upper and lower partition values, instead of by a verbal description. 
After all tallies have been made, they are counted within steps or categories
to find the f and, of course, $\sum f = N$.
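
The working procedure for choosing a step interval can be sketched as follows.
The list of preferred intervals stops at 50 only for convenience, and picking
the smallest preferred interval within the bounds is an assumption made here;
the text leaves the final choice to judgment. Function names are illustrative.

```python
PREFERRED_INTERVALS = [1, 2, 3, 4, 5, 10, 20, 30, 40, 50]


def step_interval_bounds(lowest, highest):
    """Interval sizes that would give 20 steps and 10 steps, respectively."""
    score_range = highest - lowest + 1       # number of different scores possible
    return score_range / 20, score_range / 10


def choose_step_interval(lowest, highest):
    """Smallest preferred interval lying within the bounds, or None."""
    low, high = step_interval_bounds(lowest, highest)
    candidates = [i for i in PREFERRED_INTERVALS if low <= i <= high]
    return candidates[0] if candidates else None


def step_limits(lowest, highest, interval):
    """Indicated limits of each step; lower limits are multiples of the interval."""
    first = (lowest // interval) * interval
    last = (highest // interval) * interval
    return [(lo, lo + interval - 1) for lo in range(first, last + 1, interval)]


low_bound, high_bound = step_interval_bounds(52, 143)
interval = choose_step_interval(52, 143)
limits = step_limits(52, 143, interval)
print(low_bound, high_bound, interval, len(limits))
print(limits[0], limits[-1])
```

Because each indicated lower limit is a multiple of the interval, the steps run
from 50-54 to 140-144, which accounts for the 19 steps of Example 4.1.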


COMPUTATION OF POINT MEASURES 


FINDING A PERCENTILE 


In computing descriptive statistics when scores are regarded as means of 
ranking individuals, point measures are fundamental. The basis for all 
these point measures is the percentile (sometimes called centile). As stated 
earlier, a percentile may be defined as that theoretical point in a distribu- 
tion below which lies a stated percentage of the scores. Thus, the 37th 
percentile, or P37, is a theoretical point (generally a fractional score that 
actually does not exist) below which it is assumed that 37 percent of the
distribution lies. 

When percentiles are calculated from frequency distributions, the
assumption stated earlier is always followed, namely, that within any step,
all the scores are considered to be evenly distributed from the lower step
limit or partition value to the upper step limit. That is to say, the “density”
of scores at any distance representing one unit in raw score is thought to be
the frequency for that step divided by the step interval. For example, in
Table 4.1, the frequency (denoted under f) for the step 105-109 is 20. Five
possible scores are represented in this step, namely, 105, 106, 107, 108, and
109. In making the frequency distribution, the number of cases for each
of these five scores has been lost. It is unlikely that there were exactly four
cases for each of the five different scores. One or more of the scores may
not have been represented at all. However, in computing point measures,
the 20 scores are regarded as distributed evenly over the five possible
scores, that is, four cases for each score. The “scale distance” for each
score in this particular category is .25. By this we mean that if all 200
scores were arranged in order from the lowest to the highest, and with all
scores within a category spread out evenly throughout the step, each score
in this category would occupy .25 of a scale unit. The same assumption
holds when the number of cases in a given step is not a multiple of the step
interval. For example, in the step 100-104, in which the frequency is 23,
the 23 cases would be regarded as divided evenly among the five possible
scores, namely, 4.6 cases for each unit of the scale, or a scale distance of
.217 for each case.
In a somewhat similar fashion, the variable represented by the scores is
regarded as continuous even though only integral scores are included in the
original data. The step interval 100-104, for example, is thought of as a
continuum from 99.500 to 104.499+, the last figure to be interpreted as
almost, but not quite, 104.5.
Under these assumptions, it is relatively easy to compute percentiles and,
by means of percentiles, the other point measures.

The procedure for finding any desired percentile may now be considered.
In the distribution in Table 4.1 there are 200 cases. An illustrative problem
would be finding the theoretical point below which 37 percent of the cases
fall, that is, P37.

Multiplying 200 by .37 yields 74.00. This is the theoretical number of
scores falling below the 37th percentile, or P37.

It will be noted that the next to the last column in Table 4.1 is denoted
as Cf, entries in which are the frequencies cumulated up from the bottom of
the distribution. This cumulative frequency is helpful in finding point
measures when a number of them are to be found from the same distribution.
A quick inspection shows that the 74th score lies somewhere in the
category or step 90-94. The Cf column shows that there are 60 scores
below this step, while 84 scores include all the scale up to the limit of
94.499+. We can be sure, then, that P37 is at least 89.5, but not so large as
94.5. On the assumption that the 24 scores are equally distributed throughout
the entire step, we can take as many as are necessary to round out the
needed 74 cases and find the equivalent in score terms. Of course not all 24
cases are required. The desired number of cases, 74, minus the number of
cases below the step, 60, yields 14, the number of cases required among the
24 within the step limits. Accordingly, the fraction of the total step interval
that is to be added to the bottom of the step is 14/24. Hence it is necessary
to add 14/24 of the step interval of 5 (that is, 2.92) to the bottom of the
step, 89.5, to obtain P37, which in this example is 92.42.

All of the “point measures” used in statistics are based on percentiles.
Essentially, percentiles are used to define and to locate points in the original
system of measurement. These points, in turn, are used to describe central
tendency and variability of the original measurements. Computation of
these measures is shown in Example 4.2.
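
The interpolation just described for P37 can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and argument names are hypothetical, and the figures are the ones quoted in the text for the step 90-94 of Table 4.1.

```python
def interpolate_percentile(lower_limit, cf_below, f, i, p, n):
    """Interpolate a percentile point inside one step of a grouped distribution.

    Assumes the f cases in the step are spread evenly over its i score units.
    """
    needed = p * n - cf_below      # cases to be taken from within the step
    return lower_limit + needed * i / f


# Values quoted in the text for Table 4.1 (step 90-94).
p37 = interpolate_percentile(lower_limit=89.5, cf_below=60, f=24, i=5, p=0.37, n=200)
print(round(p37, 2))     # 92.42

# Scale distance per case under the same assumption: step interval / frequency.
print(5 / 20)            # step 105-109, f = 20 -> 0.25
print(round(5 / 23, 3))  # step 100-104, f = 23 -> 0.217
```

The same function serves for any percentile once the step containing it has been located from the Cf column.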


THE MEDIAN AS A MEASURE OF CENTRAL TENDENCY 


When a group of numbers represents a series of observations, there is need 
to simplify the information so as better to understand it. If a single number 
can be chosen to represent the series, the information is simplified. The 
three most common measures of central tendency in psychological statistics 
are the mode, the median, and the mean. 

The mode was mentioned in connection with nominal data, where it is 
simply the class of greatest frequency. Since the categories have no fixed 
order, it is not a measure of central tendency. However, with ordinal data, 
the mode reveals something about the location of the middle of the distribu- 
tion, provided the distribution is reasonably symmetrical. For a distribu- 
tion such as that in Example 4.2, we can define the modal class as the most 
popular category. The largest f, 170, is in the step 180-199. Accordingly, 
this is the category that can be regarded as the most typical in the series. 

Several formulas exist for finding a single numerical value for the mode, 
but these need not require the attention of the student of elementary statis- 
tics. Perhaps the most important use of the mode is as a term describing 
frequency curves. A distribution with a single point of greatest frequency is 
unimodal; one with two points of greatest frequency is bimodal, and one 
with several such points is multimodal. Just as a mountain chain can 
have major and minor peaks, so it is sometimes convenient to describe a 
frequency distribution as having major and minor modes. 

Probably the most useful of the measures of central tendency is the mean, 
but since its computation involves addition and division of observed 
values, consideration of its properties will be reserved to Chapter 5. 

A measure of central tendency that cannot be computed for nominal
information, but which becomes possible when observations can be identified
numerically and arranged in order, is the median. The median is defined
as the theoretical point that divides the distribution into two groups with
equal frequencies. It is therefore definable as the 50th percentile, above
which there are, theoretically, exactly 50 percent of the cases and below
which are 50 percent of the cases. Its computation is shown in Example 4.2.


EXAMPLE 4.2


THE COMPUTATION OF PERCENTILES AND POINT MEASURES


Nature of Data. To compute percentiles, values are grouped in categories in a
frequency distribution. No assumption is made that the units are actually equal,
but it is assumed that the original values are adequate for purposes of ranking
and designating the percentiles. In this example, scores of 874 high school seniors
on a test of academic achievement have been distributed in 14 categories, using
a step interval of 20. In all instances, the true lower step limit, or partition value,
is .5 below the indicated limit.

Formula for Any Percentile. Steps by which any percentile may be computed
are indicated in the formula:

$$P_p = \text{lower limit of step} + \frac{[pN - Cf(\text{below the step})]\,i}{f} \tag{4.2}$$

in which $P_p$ is any percentile, pN is the desired proportion (percent divided by
100) multiplied by N, i is the number of units in the step interval, and f is the
frequency of the step containing the percentile. The expression “Cf(below the
step)” refers to the number of cases below the lower limit of the step containing
the percentile.
Finding P25. As an example, consider the operations incident to finding P25,
sometimes called the first quartile:

1. Find 25% of N. To do this, 874 is multiplied by .25 to obtain 218.5. This
is the pN of Formula 4.2.

2. Determine, from the frequencies cumulated in Table 4.2 from the bottom
of the distribution, the interval that contains P25. The Cf of 327 of the step
160-179 is greater than 218.5; the Cf of the step below is 185, which is
less than 218.5. It is therefore clear that the step 160-179 includes the 25th
percentile. The lower limit of this step, or lower partition value, is 159.5.

3. From pN is subtracted the number of cases below the step; that is, 185
(the Cf of the step below) is subtracted from 218.5, yielding 33.5.

4. This value, 33.5, is multiplied by the step interval i, which is 20, and
divided by the step frequency, or f, which is 142. (In effect, 33.5 is divided
by 142 to find .2359, as the needed proportion of the interval. Then
.2359i = 4.72.)

5. This result, 4.72, is added to the “lower limit of step,” 159.5, to find P25,
which is 164.22.

Numerical operations for five representative percentiles are shown under the
distribution in Table 4.2. All percentiles, including the median, which is P50, or
the 50th percentile, are computed in exactly the same fashion.
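
The five steps above can be sketched as a short Python function that locates the step from the cumulated frequencies and then applies Formula 4.2. The function and variable names are illustrative only; the step limits and frequencies are those of Table 4.2.

```python
from itertools import accumulate

# Table 4.2, listed from the bottom of the distribution upward:
# (indicated lower limit of step, frequency f).
STEPS = [(60, 1), (80, 6), (100, 12), (120, 64), (140, 102), (160, 142),
         (180, 170), (200, 137), (220, 110), (240, 70), (260, 40),
         (280, 14), (300, 5), (320, 1)]
STEP_INTERVAL = 20


def grouped_percentile(p, steps=STEPS, i=STEP_INTERVAL):
    """Formula 4.2: lower limit + [pN - Cf(below the step)] * i / f."""
    freqs = [f for _, f in steps]
    cum = list(accumulate(freqs))          # Cf, cumulated up from the bottom
    n = cum[-1]
    target = p * n                         # pN
    for (low, f), cf in zip(steps, cum):
        if cf >= target:                   # first step whose Cf reaches pN
            cf_below = cf - f
            true_lower_limit = low - 0.5   # partition value is .5 below
            return true_lower_limit + (target - cf_below) * i / f
    raise ValueError("p must lie between 0 and 1")


for pct in (10, 25, 50, 75, 90):
    print(pct, round(grouped_percentile(pct / 100), 2))
```

Run as written, the loop reproduces the five percentiles reported under Table 4.2 (140.36, 164.22, 192.44, 223.41, and 251.67).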

Point Measures of Variability. Once the appropriate percentiles have been
found, point measures of variability are readily determined. Q, the semi-interquartile
range, is the usual point measure of variability and is (P75 - P25)/2. In
this distribution, exactly half the cases (when all are arranged in order under
the conventions of ordinal measurement) lie between the limits of 223.41 and
164.22. Half this distance would seem to be an appropriate measure of dispersion
or scatter.

Another measure of dispersion is D, a modified range that is merely (P90 - P10),
and in this case is 111.3.


TABLE 4.2. COMPUTATION OF PERCENTILES
(The data are the scores of 874 high school seniors on a test of academic achievement.)


STEPS      f      Cfᵃ
320-339    1      874
300-319    5      873
280-299    14     868
260-279    40     854
240-259    70     814   (Step includes 90th percentile)
220-239    110    744   (Step includes 75th percentile)
200-219    137    634
180-199    170    497   (Step includes 50th percentile)
160-179    142    327   (Step includes 25th percentile)
140-159    102    185   (Step includes 10th percentile)
120-139    64     83
100-119    12     19
80-99      6      7
60-79      1      1

Computations

90% of 874 = 786.6    P90 = 239.5 + (786.6 - 744)20/70  = 251.67
75% of 874 = 655.5    P75 = 219.5 + (655.5 - 634)20/110 = 223.41
50% of 874 = 437      P50 = 179.5 + (437 - 327)20/170   = 192.44
25% of 874 = 218.5    P25 = 159.5 + (218.5 - 185)20/142 = 164.22
10% of 874 = 87.4     P10 = 139.5 + (87.4 - 83)20/102   = 140.36

Median = P50 = 192.4
Q = (P75 - P25)/2 = (223.41 - 164.22)/2 = 29.6
D = P90 - P10 = 251.67 - 140.36 = 111.3

ᵃ Cumulated up.


Checking a Percentile. Any percentile may be checked by working from the
top of the distribution. Four changes are necessary, but since the procedure is
exactly analogous, the changes are merely those incident to working from a
different direction. The logic is identical.

Instead of the “lower limit of step,” the “upper limit” is used. Instead of pN,
we use (1 - p)N; and instead of “Cf(below the step)” we use the number of
cases above the step. Instead of adding to the “lower limit of the step,” we
subtract from the upper limit. These operations are summarized as Formula 4.2a
under the distribution in Table 4.3.


By this procedure, P25 is found as follows:

(1 - p)N = (.75 × 874) = 655.5

Counting down from the top of the distribution, the step interval containing
the desired percentile is found. Again it is the step 160-179, the upper limit of
which is 179.5. Above this step are 547 cases. Thus (655.5 - 547) times 20 and
divided by the f for the step, 142, is 15.28. When 15.28 is subtracted from the
upper limit, 179.5, the result is 164.22, which is P25 as found by the usual
method.


TABLE 4.3. CHECKING A PERCENTILE


STEPS      f      Cfᵃ
320-339    1      1
300-319    5      6
280-299    14     20
260-279    40     60
240-259    70     130
220-239    110    240
200-219    137    377
180-199    170    547
160-179    142    689   (Step includes P25)
140-159    102    791
120-139    64     855
100-119    12     867
80-99      6      873
60-79      1      874

Computation

$$P_p = \text{upper limit of step} - \frac{[(1 - p)N - Cf(\text{above the step})]\,i}{f} \tag{4.2a}$$

P25 = 179.5 - [(.75)874 - 547]20/142 = 179.5 - (655.5 - 547)20/142 = 164.22

ᵃ Cumulated down.
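
As a sketch of the check, the two directions of computation can be placed side by side in Python. The function names are hypothetical; the figures are those given for the step 160-179 in Tables 4.2 and 4.3.

```python
def percentile_from_top(upper_limit, cf_above, f, i, p, n):
    """Formula 4.2a: upper limit - [(1 - p)N - Cf(above the step)] * i / f."""
    return upper_limit - ((1 - p) * n - cf_above) * i / f


def percentile_from_bottom(lower_limit, cf_below, f, i, p, n):
    """Formula 4.2: lower limit + [pN - Cf(below the step)] * i / f."""
    return lower_limit + (p * n - cf_below) * i / f


# Step 160-179 of Tables 4.2 and 4.3: f = 142, 185 cases below, 547 above.
from_top = percentile_from_top(179.5, cf_above=547, f=142, i=20, p=0.25, n=874)
from_bottom = percentile_from_bottom(159.5, cf_below=185, f=142, i=20, p=0.25, n=874)
print(round(from_top, 2), round(from_bottom, 2))   # 164.22 164.22
```

The two results agree because the cases below the step, within it, and above it together account for all N cases.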


POINT MEASURES OF VARIABILITY: THE RANGE, D AND Q 


Of a series of values representing psychological data, a second way in which 
to simplify the information is to report a measure of their variability, or 
spread, or scatter. If all values tend to be more or less identical with the 
measure of central tendency, then there is little variability. Such is the case 
with certain structural characteristics, such as the number of fingers. A 
few individuals have fewer than 10 fingers because of prenatal or postnatal 
accidents, and a few have polydactylism, a condition with more than 10 
fingers. However, the human race as a whole shows remarkably little 
variability in the number of fingers. 

There is more variability in other structural characteristics such as
height (in which some adults are twice as tall as other adults) and in weight,
where the ratio of extremes is greater than with height. In psychological
traits such as intelligence, there is no way of comparing variability on any
absolute basis, but from observing the low-grade feeble-minded on the
one hand, and highly intelligent, creative individuals on the other, one
would estimate that the variability of the human race in this respect is
very great.

With a series of measurements, a measure of variability along with a
measure of central tendency can be used better to describe or summarize 
the series. On the same measuring device, two groups may be equal in 
central tendency, but very different in variability, or vice versa. 

Three measures of variability that can be computed with ordinal data 
are often useful. The first is the range, mentioned in connection with estab- 
lishing the step interval for a frequency distribution. The second is D, a 
modified range, and the third is Q, the semi-interquartile range. 

The range, as the highest value in the series less the lowest value (plus
1 if the total number of different possible values is wanted), gives the maximum
variability in the particular sample. Because the two extreme values
are often determined by single scores, and because what scores happen to
be the highest and lowest are greatly affected by the composition of the particular
sample, the range is likely to vary considerably from sample to
sample. Since successive samples tend to yield inconsistent values, the range
is used only as a crude measure of variability or as a help in planning the
computation of more reliable descriptive statistics.

The statistic D was proposed by Kelley (1) as a modified range to describe
the variability of a group of values. It is defined as the 90th percentile less
the 10th percentile (P90 - P10). Since P90 and P10 are measures that are
more stable than are the two extremes of the distribution, D is more consistent
from sample to sample than is the total range. Although D is an
easily defended descriptive statistic, it is seldom used, chiefly because the
total range is more useful for the practical purpose of planning a frequency
distribution and Q seems to convey more information about the variability
of the sample.


The semi-interquartile range Q is defined as half the distance between the
third and first quartiles, that is,

$$Q = \frac{P_{75} - P_{25}}{2} \tag{4.3}$$

One quarter of all the cases in the distribution are above P75, with another
quarter below P25. Accordingly, if the distribution is symmetrical, the scale
distance of 1.00Q above the median and 1.00Q below it will include 50
percent of the cases. If the cases are bunched close to the median, it is
obvious that Q will be small, whereas if the cases scatter away from the
median, Q will be large. Computation of Q, as does the computation of the
median, the range, and D, requires original scale values, such as the raw
scores on a psychological test, as reference points. From them the numerical
values of the point measures are obtained.
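
As a small illustration, Q and D can be computed from percentiles already in hand. The function names below are hypothetical, and the percentile values are those reported for the 874 high school seniors in Table 4.2.

```python
def semi_interquartile_range(p75, p25):
    """Q, Formula 4.3: half the distance between the third and first quartiles."""
    return (p75 - p25) / 2


def modified_range(p90, p10):
    """Kelley's D: the 90th percentile less the 10th percentile."""
    return p90 - p10


# Percentiles computed in Table 4.2.
q = semi_interquartile_range(p75=223.41, p25=164.22)
d = modified_range(p90=251.67, p10=140.36)
median = 192.44
print(round(q, 1), round(d, 1))

# If the distribution were symmetrical, median - Q and median + Q would
# coincide with P25 and P75 and so bracket half the cases.
print(round(median - q, 2), round(median + q, 2))
```

In this distribution the two bracketing points fall slightly below P25 and P75, which reflects a modest departure from symmetry.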

In describing several samples of subjects measured on the same variable,
medians can be used to rank the groups as to which stands highest, which
next highest, and so on, while Q can be used to compare the variability of
the groups. Groups that are alike in central tendency may still differ in
variability, and vice versa. Furthermore, as will be discussed in Chapter 13,
measures of variability are essential in evaluating obtained differences in
measures of central tendency, although Q itself is seldom used for this purpose.

The point measures provide a convenient method of reducing obtained
scores on psychological tests to a form facilitating comparison from individual
to individual on the same test and from test to test for the same
individual.

Sometimes tables of score equivalents at selected percentile points, as in
the following table, are given. Data represent the scores of 998 male engineering
freshmen at the University of Minnesota Institute of Technology
on the Minnesota paper form board test (3).


PERCENTILE SCORE 
99 62 
95 59 
90 57 
80 55 
75 54 
70 53 
60 50 
50 49 
40 47 
30 46 
25 44 
20 43 
10 39 
5 36 
1 30 


Score equivalents have been rounded, so that no fractions are reported.
By means of a table of this sort, obtained scores are quickly interpreted for
guidance and other purposes. For example, a young man with a score of
52 on the test has a score as good as or better than that of two-thirds of
the freshman engineers in the norm group.
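
The reading of “two-thirds” for a score of 52 can be reproduced with a short lookup. This is a sketch under an assumption not stated in the text: that a score falling between two tabled score equivalents is placed by straight-line interpolation between their percentiles. The function name is hypothetical.

```python
from bisect import bisect_right

# Norms table from the text: (score equivalent, percentile), ascending by score.
NORMS = [(30, 1), (36, 5), (39, 10), (43, 20), (44, 25), (46, 30), (47, 40),
         (49, 50), (50, 60), (53, 70), (54, 75), (55, 80), (57, 90), (59, 95),
         (62, 99)]


def approximate_percentile(score, norms=NORMS):
    """Linearly interpolate a percentile for a raw score (an assumed convention)."""
    scores = [s for s, _ in norms]
    if score <= scores[0]:
        return norms[0][1]              # clamp below the lowest tabled score
    if score >= scores[-1]:
        return norms[-1][1]             # clamp above the highest tabled score
    hi = bisect_right(scores, score)    # index of first tabled score > score
    (s0, p0), (s1, p1) = norms[hi - 1], norms[hi]
    return p0 + (score - s0) * (p1 - p0) / (s1 - s0)


print(round(approximate_percentile(52), 1))   # 66.7
```

A score of 52 lies two-thirds of the way from 50 (the 60th percentile) to 53 (the 70th), which gives roughly the 67th percentile.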


PERCENTILE RANKS 


Another method of norming tests is to provide for each possible score its
percentile equivalent, or percentile rank, as demonstrated in Example 4.3.


EXAMPLE 4.3


THE CALCULATION OF PERCENTILE RANKS


The usual formula for a percentile rank computed from a frequency distribution
is

$$\text{Percentile rank} = \frac{[Cf(\text{below the step}) + .5f]\,100}{N} \tag{4.4}$$

in which f is the step frequency and 100 is the multiplying factor to remove the
decimals. This gives the percentile rank of the score at the midpoint of the step,
which is often assigned to all the scores in the step. For more exact work, a step
interval of 1 can be used.

An equivalent formula is

$$\text{Percentile rank} = \frac{(Cf - .5f)\,100}{N} \tag{4.4a}$$

in which Cf is the cumulative frequency of the step, the cumulations being made
from the bottom of the distribution.

Computation. Two methods of calculating the percentile ranks, both with a
step interval of 1, are illustrated. The first is for computations by hand. The
second method was developed by Thurstone and permits fast computation when
a calculating machine is available.

For the hand method, the frequencies in column 2 of Table 4.4 are cumulated
from the bottom of the distribution to form the Cf in column 3. Next, in each
step, one-half of the frequency is subtracted from the Cf to form the (Cf - .5f),
which are in column 4. These figures are divided by N and multiplied by 100 to
form the percentile ranks in column 5.

Machine Computation. In a method described by Thurstone (5), half of the 
reciprocal of N is determined. In this case, N is 179 and 1/2N is .002793. Since 
there are six places of decimals in the half-reciprocal, and it is desired to multiply 
by 100 to remove decimals in the part retained, four places of decimals are marked 
off in the product dials of the calculating machine. The half-reciprocal (or rate) 
is multiplied by each frequency twice, but since only the percentile ranks at the 
mid-points are desired, the result of the second multiplication at each step is 
disregarded. The multiplications begin at the bottom of the distribution, and both 
products and multipliers are allowed to accumulate. The first multiplication of 
.002793 by 4 gives the rounded percentile rank of 1. The second multiplication
by 4 is disregarded. The next multiplication also happens to be by 4, the fre- 
quency of the second step. As a rounded percentile, it is recorded as 3, the per- 
centile rank of the midpoint of the second step. The process continues, alter- 
nately recording and discarding the cumulated results of the multiplications. 
When the two multiplications by the frequency of the top step have been completed,
the accumulated sum of the multipliers should be 2N, or 358, and the
accumulated sum of the products should be 100.

In comparing the standing of the same individual on several tests, the
percentile ranks can be compared to determine the areas in which he is
strong and the areas in which he is weak. An entering freshman with a percentile
rank of 90 on a verbal intelligence test might be an excellent risk for
a liberal arts course, but with percentile ranks of 20 or so in numerical
ability, spatial relations, and mechanical comprehension, he would be a
poor risk in engineering.


TABLE 4.4. CALCULATION OF PERCENTILE RANKS
(The data represent the scores of 179 students on an achievement test.)


(1)      (2)    (3)    (4)         (5)
SCORE    f      Cf     Cf - .5f    PERCENTILE RANK
83       1      179    178.5       100
82       1      178    177.5       99
81       2      177    176         98
80       1      175    174.5       97
79       1      174    173.5       97
78       6      173    170         95
77       2      167    166         93
76       7      165    161.5       90
75       3      158    156.5       87
74       9      155    150.5       84
73       4      146    144         80
72       6      142    139         78
71       6      136    133         74
70       16     130    122         68
69       8      114    110         61
68       16     106    98          55
67       18     90     81          45
66       22     72     61          34
65       18     50     41          23
64       15     32     24.5        14
63       9      17     12.5        7
62       4      8      6           3
61       4      4      2           1


N = 179; 1/N = .005587; 1/2N = .002793.
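
Both procedures can be sketched in Python using the frequencies of Table 4.4. The function names are illustrative, and the machine method is only an imitation of the accumulating dials described in the text, not a reconstruction of Thurstone's own worksheet.

```python
# Table 4.4: scores 61..83 with their frequencies, listed from the bottom up.
FREQS = [4, 4, 9, 15, 18, 22, 18, 16, 8, 16, 6, 6, 4, 9, 3, 7, 2, 6, 1, 1, 2, 1, 1]
SCORES = list(range(61, 84))
N = sum(FREQS)


def round_half_up(x):
    """Round to the nearest whole number (Python's round() rounds halves to even)."""
    return int(x + 0.5)


def hand_method(freqs):
    """Formula 4.4a: (Cf - .5f) * 100 / N at each step."""
    ranks, cf = [], 0
    n = sum(freqs)
    for f in freqs:
        cf += f
        ranks.append(round_half_up((cf - 0.5 * f) * 100 / n))
    return ranks


def machine_method(freqs):
    """Accumulate rate * f twice per step, recording after the first multiplication."""
    rate = round(1 / (2 * sum(freqs)), 6)   # half-reciprocal of N, six decimals
    ranks, product = [], 0.0
    for f in freqs:
        product += rate * f                 # first multiplication: midpoint of step
        ranks.append(round_half_up(product * 100))
        product += rate * f                 # second multiplication: result disregarded
    return ranks


hand = hand_method(FREQS)
machine = machine_method(FREQS)
for score, pr in zip(SCORES, hand):
    print(score, pr)
```

For these data the two methods give the same rounded percentile ranks, which match column 5 of Table 4.4.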

In interpreting percentiles and percentile ranks, it will be noted that a
difference of, say, ten percentile points near the middle of the distribution
will involve fewer raw score points than a difference of the same number of
percentile points at the top or bottom of the distribution. In the table of
norms for the Minnesota paper form board test, it is seen that the difference
between P60 and P50 is one raw-score point and between P50 and P40 is
two raw-score points, while the difference between P10 and P1 is nine
raw-score points.

This results from the fact that with most psychological measures, scores
tend to pile up near the center of the distribution and to become rarer as
the scores deviate from the median.


DISTRIBUTION OF A SET OF PERCENTILE RANKS 


In contrast with obtained scores, a distribution of percentile ranks (if 
based on an exceedingly large number of different raw scores) yields a 
“rectilinear” or flat distribution. This can be readily understood from the 
concept of deciles, which are sometimes used for norming tests when there 
is not much variability in raw scores.

To convert scores into deciles, the following percentiles are computed:
P90, P80, P70, P60, P50, P40, P30, P20, and P10. Then the group with raw
scores above P90 are in the 10th decile group, the group with scores between
P80 and P90 are in the 9th decile group, and so on to the group with scores
below P10, which are in the 1st decile group. It will be readily seen that, in
theory at least, equal numbers (that is, 10 percent of N) will be assigned to
each decile group. Thus, with identical f in each of the ten categories, the
distribution of decile scores will be flat or “rectilinear.” The same basic
principle applies to percentile ranks, which are finer subdivisions than are
decile groups. Accordingly, a distribution of percentile ranks is theoretically
a flat or rectilinear distribution. In practice, however, it can be truly
rectilinear only when both N and the range are large. The principle, however,
must be kept in mind when percentiles or percentile ranks are used in
interpretations of psychological test scores.
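
The flatness of decile groups can be seen in a small simulation. Everything in this sketch is assumed for illustration: the 1,000 scores are randomly generated, not data from the text, and the helper names are hypothetical.

```python
import random
from collections import Counter

random.seed(0)                       # illustrative data only, not from the text
N = 1000
scores = [random.gauss(50, 10) for _ in range(N)]   # piles up near the center


def percentile_rank(score, all_scores):
    """Percent of cases below the score, counting half of any ties at the score."""
    below = sum(s < score for s in all_scores)
    tied = sum(s == score for s in all_scores)
    return (below + 0.5 * tied) * 100 / len(all_scores)


def decile_group(pr):
    """Map a percentile rank in (0, 100) to a decile group 1..10."""
    return min(int(pr // 10) + 1, 10)


groups = Counter(decile_group(percentile_rank(s, scores)) for s in scores)
print(sorted(groups.items()))
```

Although the raw scores bunch near 50, each decile group receives one-tenth of the cases, which is the rectilinear distribution the text describes. With few distinct raw scores and many ties the groups would be less even.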


RANK CORRELATION
SPEARMAN'S ρ


Just as the contingency coefficient is a means of describing association between
two nominal variables, so rank correlation is a method for describing
how two ordinal variables tend to vary together. There are two important
methods for rank correlation, Spearman's rho (the symbol¹ for which is
ρ) and Kendall’s tau (the symbol for which is τ).

The computation of ρ, which varies from -1.00 through .00 to +1.00, is
illustrated in Examples 4.4 and 4.5. If ρ = -1.00, the highest rank in one
variable is associated with the lowest rank in the second variable, the next
highest rank in the first variable is associated with the next to the lowest
rank in the second variable, and so on. This would be perfect inverse
relationship: the higher the standing in one variable, the lower in the other.

¹ In order to follow the convention of Roman letters for descriptive statistics, Greek
letters for parameters, some authors have discarded ρ as the symbol for rank correlation
in the sample. However, many psychologists continue to follow Spearman's original
usage of ρ and to use τ for Kendall's coefficient.


EXAMPLE 4.4


COMPUTATION OF SPEARMAN'S ρ TO MEASURE RANK CORRELATION


              RANK    SCORE    RANK
INDIVIDUAL    IN X    IN Y     IN Y    D      D²
A             1       25       3       2      4
B             2       30       1.5     .5     .25
C             3       30       1.5     1.5    2.25
D             4       15       5       1      1
E             5       15       5       0      0
F             6       15       5       1      1
G             7       10       7       0      0
                                       ΣD² = 8.50


To compute ρ, paired observations must be expressed in ranks. In these
hypothetical data, variable X consists of ranks; variable Y is in score form, but
is converted to ranks. Any scores that are tied are assigned the midrank, that is,
the different ranks required for the ties are summed and divided by their number.
Thus, the two values of 30 are tied for first place. The corresponding ranks, 1
and 2, are summed and divided by 2, yielding a rank of 1.5 each. The next score
is 25, with a rank of 3. Then three scores, all with a value of 15, are tied. The
corresponding ranks are 4, 5, and 6, which when summed and divided by 3
yield a midrank of 5. The final rank is 7.

Strictly speaking, ρ does not apply when there are ties, but the midrank
method is generally accepted for treating ties. Rank correlation is often used
in psychology and education when there are only a few cases and a general idea
of the relationship is wanted.

In the column headed D appears the difference between the rank in X and the
rank in Y without regard to sign. These differences are squared and entered in
the column headed D².

In rank correlation it is often necessary to square numbers ending in .5. A
simple rule, which can be verified algebraically by expanding (X + .5)², is to
multiply the number, X, by (X + 1) and add .25. Thus, (1.5)² is 2.25; (2.5)² is
6.25; and (8.5)² is 72.25.

The sum of the D² is 8.50. The computation of ρ follows:

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 8.5}{7 \times 48} = 1 - \frac{8.5}{56} = 1 - .15 = .85$$
This shows a substantial relationship between the two variables. However,
since ρ is based on only seven cases, it may not truly represent the relationship
in the population from which the sample has been drawn. The evaluation of
obtained ρ in terms of what is to be expected in a series of samples is reserved
to a later chapter.


EXAMPLE 4.5


CORRELATION BETWEEN TWO SETS OF RANKS


                                   RANK IN     RANK IN
                                   FATHERS'    MOTHERS'
CATEGORY                           REPORTS     REPORTS     D      D²
Companionship                      1           2           1      1
Personality characteristics        2           1           1      1
Intellectual abilities             3           3           0      0
Fact of having                     4           13          9      81
Child rearer                       5           5           0      0
Endearing mannerisms               6           15          9      81
Relatives' relations with child    7           10          3      9
Motor ability, coordination        8           12          4      16
Growth, developmental progress     9.5         14          4.5    20.25
Artistic abilities, interest       9.5         11          1.5    2.25
Relationships with siblings        11          9           2      4
Social relationships               12          4           8      64
School progress                    13.5        6           7.5    56.25
Routines                           13.5        7           6.5    42.25
Interests, hobbies                 15          8           7      49

N = 15. ΣD² = 427

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{6 \times 427}{15 \times 14 \times 16} = 1 - \frac{61}{80} = .24$$


A comparison of the order in which categories of “satisfactions” in child
rearing are ranked in fathers’ and mothers’ reports is made in the preceding
table by means of Spearman's ρ. Data are modified from Tasch (4). Categories
are taken from a list of 35, but only categories rated in the top ten by fathers or
mothers are included. Ranks are based on reports of 544 mothers and 85 fathers.
The coefficient of .24 reflects a low, positive relationship between fathers’ and
mothers’ reported satisfactions in child rearing.
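
The computations of Examples 4.4 and 4.5 can be sketched in Python. The function names are illustrative; the midrank routine follows the tie convention described in Example 4.4, with rank 1 assigned to the highest score.

```python
def midranks(scores):
    """Rank scores from highest (rank 1) to lowest, giving tied scores the midrank."""
    ordered = sorted(scores, reverse=True)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1            # first rank the value occupies
        count = ordered.count(value)
        rank_of[value] = first + (count - 1) / 2    # mean of the tied ranks
    return [rank_of[s] for s in scores]


def spearman_rho(ranks_x, ranks_y):
    """Formula 4.5: 1 - 6 * sum(D^2) / (N * (N^2 - 1))."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Example 4.4: X is already in ranks; Y is in score form.
rank_x = [1, 2, 3, 4, 5, 6, 7]
score_y = [25, 30, 30, 15, 15, 15, 10]
rank_y = midranks(score_y)
print(rank_y)
print(spearman_rho(rank_x, rank_y))

# Example 4.5: both variables are already in ranks.
fathers = [1, 2, 3, 4, 5, 6, 7, 8, 9.5, 9.5, 11, 12, 13.5, 13.5, 15]
mothers = [2, 1, 3, 13, 5, 15, 10, 12, 14, 11, 9, 4, 6, 7, 8]
print(spearman_rho(fathers, mothers))
```

The unrounded results are about .848 and .2375, in agreement with the .85 and .24 reported in the two examples.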


Perfect positive relationship, in which the two sets of ranks are identical,
results in a ρ of +1.00. If there is no association between the two variables,
so that the rank in one gives no indication of the rank in the other, ρ
is .00.

The formula for ρ is

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)} \tag{4.5}$$

in which N is the number of ranked cases and ΣD² is the sum of the squares
of the differences in ranks.


KENDALL’S τ (TAU)


A second coefficient used to describe the relationship between two sets of
ranks is τ (tau), proposed by Kendall (2). As does ρ, τ varies from -1.00,
indicating perfect inverse relationship, through .00, indicating no relationship,
to +1.00, indicating perfect relationship. At intermediate points, the
correspondence between the two coefficients is only fair, with τ usually
smaller in absolute magnitude. The computation of τ (for an example
without ties) is shown in Example 4.6.


EXAMPLE 4.6


COMPUTATION OF KENDALL'S τ


When there are no ties, Kendall's τ can be found as follows:

1. One variable (in this case X) is arranged in order from 1 through N. Each
rank in X continues, of course, to be associated with the corresponding rank
in Y.

2. For each rank in Y (denoted as Yi), the number of ranks is found that are
below it in the same column and that are numerically greater than Yi. This
gives the number of pair-to-pair comparisons involving Yi, which constitute
agreements. In the example, these numbers are tabulated as “pair-to-pair
agreements.” The sum of them is then used in Formula 4.6.

It is to be noted that only ranks greater than Yi and below Yi in the same
column are counted. This ensures that each pair of paired ranks is examined
only once. If ranks greater than Yi and above Yi were counted, the result would
be the number of disagreements, which still could be used to find τ but which
would require a different formula.

Procedures involving tied ranks are given by Kendall (2).

The computation of τ for a case without ties follows:


                          NUMBER OF
        RANK    RANK      PAIR-TO-PAIR
CASE    IN X    IN Y      AGREEMENTS
A       1       3         9
B       2       1         10
C       3       2         9
D       4       5         7
E       5       4         7
F       6       7         5
G       7       6         5
H       8       12        0
I       9       9         2
J       10      8         2
K       11      10        1
L       12      11        0
                          57


The number of pair-to-pair agreements is 57, which is substituted in Formula
4.6. Then

$$\tau = \frac{4(\text{number of pair-to-pair agreements})}{N(N - 1)} - 1 = \frac{4 \times 57}{12 \times 11} - 1 = 1.73 - 1.00 = .73$$

The result indicates considerable agreement between the two variables. For
the same data, ρ = .89.
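
The counting procedure of Example 4.6 can be sketched in Python. The function names are illustrative. The Y rank of 3 for case A is illegible in the scanned table and is inferred here as the only rank from 1 through 12 not otherwise used; it is consistent with the 9 agreements tabulated for that case.

```python
def pair_agreements(ranks_y):
    """For each Y rank, count the ranks below it in the column that are greater.

    Assumes the cases are already arranged so that X runs 1 through N, with no ties.
    """
    return [sum(later > y for later in ranks_y[k + 1:])
            for k, y in enumerate(ranks_y)]


def kendall_tau(ranks_y):
    """Formula 4.6: 4 * agreements / (N * (N - 1)) - 1."""
    n = len(ranks_y)
    agreements = sum(pair_agreements(ranks_y))
    return 4 * agreements / (n * (n - 1)) - 1


# Example 4.6: ranks in Y for cases A through L (ranks in X are 1 through 12).
rank_y = [3, 1, 2, 5, 4, 7, 6, 12, 9, 8, 10, 11]
counts = pair_agreements(rank_y)
print(counts, sum(counts))              # ..., 57
print(round(kendall_tau(rank_y), 2))    # 0.73
```

Because each case is compared only with the cases listed below it, every pair is examined exactly once, as the text requires.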


Spearman’s ρ has two advantages over τ. In the first place, it belongs to
the family of product-moment coefficients described in Chapters 6, 7, 8,
and 9; and, in fact, if there are no ties, ρ is precisely the product-moment
correlation between the two sets of ranks. In the second place, ties are
easier to handle with ρ than with τ, even though no theoretically perfect
solution for tied ranks exists.

With τ, however, the solution for tied ranks is more satisfactory in that
it is in accordance with the theory of the coefficient. Another advantage of
τ is that its sampling distribution² is well understood, even when there are
ties in ranks, and hence it is more useful in making inferences about the
population from a knowledge of a sample.

The logic of τ is readily understood. Consider the case of any two pairs
of paired ranks in which the two values of variable X are two different
numbers of the series 1 through N, inclusive, and the two values of Y are
also two numbers from the same series. Then, in the two pairs, the order
is the same (agreement) or different (disagreement), as illustrated in the
following table:


PAIR-TO-PAIR AGREEMENT       PAIR-TO-PAIR DISAGREEMENT
RANK IN X    RANK IN Y       RANK IN X    RANK IN Y
2            3               2            5
5            4               5            4

With N the number of ranks in each variable, the total number of pair-
to-pair comparisons is the number of combinations of N things taken two
at a time, or N(N − 1)/2. If all pair-to-pair comparisons represent agree-
ments, then the relationship should be represented as +1.00; if all pair-to-
pair comparisons are disagreements, then the relationship should be repre-
sented as −1.00. If agreements and disagreements are equally divided,
then the relationship should be represented as .00. Kendall's τ is a
coefficient designed to measure the relationship between two sets of ranks by
a ratio that meets these conditions. It is simply the number of pair-to-pair
agreements less the number of pair-to-pair disagreements, with the result
divided by the total number of pair-to-pair comparisons. Since numbers of
agreements and disagreements add to N(N − 1)/2, τ can be found from
either. A convenient formula uses only the number of agreements:


$$\tau = \frac{4\,(\text{number of pair-to-pair agreements})}{N(N-1)} - 1$$


² If a descriptive statistic were computed in each one of an indefinitely long series of
equivalent samples, drawn at random from the same unlimited population, the result
would be a "sampling distribution." Certain characteristics of sampling distributions
are treated in Chapter 13.
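
The agreement-count formula can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `kendall_tau` is hypothetical, the rankings are assumed to contain no ties, and every pair of cases is compared directly rather than by any shortcut procedure.

```python
from itertools import combinations


def kendall_tau(x_ranks, y_ranks):
    """Kendall's tau for two untied rankings, from the number of agreements."""
    if len(x_ranks) != len(y_ranks):
        raise ValueError("both rankings must have the same length")
    n = len(x_ranks)
    if n < 2:
        raise ValueError("at least two cases are needed")
    agreements = 0
    for i, j in combinations(range(n), 2):
        # A pair agrees when the two cases stand in the same order on X and on Y,
        # i.e. when the two rank differences have the same sign.
        if (x_ranks[i] - x_ranks[j]) * (y_ranks[i] - y_ranks[j]) > 0:
            agreements += 1
    # tau = 4 * agreements / (N(N - 1)) - 1
    return 4 * agreements / (n * (n - 1)) - 1
```

Applied to the two small tables above, `kendall_tau([2, 5], [3, 4])` counts one agreement out of one comparison, and `kendall_tau([2, 5], [5, 4])` counts none, giving the two extreme values of the coefficient.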


SUMMARY 


When description is by ranking rather than by assignment to defined
categories, a second system of statistics becomes possible. All the statistics
applicable to nominal data can be used with ordinal information, plus the
descriptive statistics based on percentiles. Central tendency is measured by
the median, and variability by the range and by Q. Although numerical
values for these statistics are in terms of obtained values, the measuring
instrument is thought of as essentially a device for ranking the observations.

Percentiles and percentile ranks are convenient means of norming tests,
so that standing of the same individual on different tests or of different
individuals on the same test can be easily compared.

Spearman's ρ and Kendall's τ are descriptive statistics for showing the
relationship between two sets of ranks.

Rank correlation requires only ranks, and hence it can be used with
ordinal measurement. Values from more advanced types of measurement
can, of course, be converted into ranks and rank correlation can be applied.

Rank correlation is a convenient descriptive statistic when N is small,
and a quickly obtained measure of relationship is needed as an aid in
planning further investigation.

Rank correlation is also a good introduction to other measures of cor-
relation to be discussed in later chapters.


EXERCISES 


1. The following represent scores on a reading comprehension test:


34, 34, 33, 32, 32, 31, 31, 31, 31, 30, 30, 30, 30, 29,
29, 29, 29, 29, 28, 28, 28, 28, 28, 28, 27, 27, 27, 27,
6, 26, 25, 25, 25, 25,
24, 24, 24, 24, 24, 24, 24, 24,
24, 24, 24, 24, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23,
22, 22, 22, 22, 22, 22, 22, 21, 21, 21, 21, 21, 21, 21,
21, 20, 20, 20, 20, 19, 18, 18, 18, 18, 18, 18, 17, 17,
17, 17, 16, 14, 14, 14, 13, 13, 13, 13, 12, 12, 12, 11,
11, 11, 10, 9, 9, 9, 9, 8, 7, 7, 6, 5, 5.

Make a distribution of these scores, using a step interval of 2. Find the median,
D, and Q.


2. Using a step interval of 1 and the data of Exercise 1, prepare a table of
percentile ranks.


3. Given the following distribution, find Pas, the median, and 0. 


STEPS "à 


42-44 
39-41 
36-38 
33-35 
30-32 
27-29 
24-26 
21-23 
18-20 
15-17 
12-14 


HwhUuaAcdrann 


4. For the following distribution, find percentile ranks corresponding to raw 
scores, 


x f x f 
103 3 87 72 
102 5 86 68 
101 8 85 58 
100 10 84 50 
99 15 83 43 
98 25 82 43 
97 23 81 33 
96 30 80 20 
95 36 79 15 
94 49 78 17 
93 58 77 18
92 A 76 10 
91 95 75 8 
90 132 74 
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5. For the following distribution of scores, prepare a table of percentile norms 
To do this, compute the following percentiles: Pos, Pos, Poo, Pao, P75, Рто 
Poo, Ро, Рі, Рзо, P2s, Pio, Ps, and Рі. Round each percentile to the пав 


integral score. 


STEP LIMITS f 
170-179 1 
160-169 0 
150-159 3 
140-149 11
130-139 19 
120-129 24 
110-119 38 
100-109 49 
90-99 66 
80-89 86 
70-79 90 
60-69 52 
50-59 27 
40-49 18 
30-39 10 
20-29 4 
10-19 2 


6. In three sections of high school mathematics taught by different instructors, 
distributions of final grades were as follows: 


GRADE SECTION I SECTION II SECTION III 


A 5 10 0 
B 14 12 18
C 8 10 10
D 8 0 7
F 2 0 5
TOTALS 37 32 40 


The principal, who was taking a course in statistics, decided to compute the
median for each section by assigning numerical values as follows: A = 4;
B = 3; C = 2; D = 1; and F = 0. He considered the step interval to be 1.
As the partition value between A and B, he used 3.5. He used 2.5 as the
partition value between B and C, 1.5 as the partition value between C and
D, and .5 as the partition value between D and F. He reported the medians
at a teachers' meeting, correct to two places of decimals. What were his
results?

7. In two practice heats, a squad of cross-country runners finished as follows:


NAME 1ST HEAT 2ND HEAT
Jack 1 1 
Roger 2 7 
Lon 3 3 
Sid 4 5 
Bob 5 2 
Doug 6 11 
Chuck 7 8 
Jud 8 6 
Frank 9 9 
Stu 10 4 
John 11 10 
Paul 12 13 
Stan 13 12 
Glenn 14 14 


Find ρ and τ.


8. For the following data on two variables, compute ρ:


INDIVIDUAL VARIABLE I VARIABLE II


A 35 6
B 25 12
C 45 7
D 45 9
E 20 15
F 35 12
G 35 8
H 30 11
I 50 8
J 40 10


BY AVERAGING


5 


Test scores, errors made by a rat in learning a maze, ages of children (in
fact, most of the varieties of numerical information of interest in psychology
and education) are commonly summarized by averaging. An average,
which takes into account the actual values of all the numbers in a series,
is generally a highly representative and dependable summarization.

This chapter is concerned with three averages:

1. The arithmetic mean, used as a measure of the central tendency, or the
"center of gravity";

2. The variance, which is the mean of the squares of the differences
between a set of values and their arithmetic mean; and

3. The standard deviation, the most useful of the direct measures of
variability. It is the positive square root of the variance.


INTERVAL SCALES AND RATIO SCALES


If the mean, the variance, and the standard deviation are to be appropriate
statistics, the variable must have values such that addition is a meaningful
operation. The chief requirement is that the characteristic be measured
along a scale, the units of which can be considered equal. This is clearly
the case with months of chronological age, even though months do differ
as much as 10.7 percent in length. Since the different parts of a maze may
usually be considered to be equally difficult, errors in a maze are also taken 
as additive. Whether a set of psychological or educational test scores 
(generally the number of items successfully completed, or the number of 
correct items less a proportion of the items answered incorrectly) can 
meaningfully be added is less easily decided. In some cases attempts are 
made to develop tests so that two conditions are met: 


1. All the items measure slightly different aspects of the same general 
trait or quality; and 

2. Equal score differences at different positions in the range represent 
equal differences in ability. 


While the second objective is seldom, if ever, attained, addition is a 
convenient tool in summarizing scores on educational and psychological 
tests. Accordingly, such tests are customarily treated as if they constituted 
interval scales, in which measurement is in units that are equal but which 
lack a true zero point. 

In physics, the Fahrenheit and centigrade thermometers constitute
interval scales. In neither case, however, does zero represent absence of
heat, since 0° is an arbitrary reference point. Accordingly, one cannot say
that a day on which the temperature is 90° F is twice as hot as one on
which the temperature is 45° F. However, as is the case with all interval
scales, degrees of temperature are additive. If on three separate occasions
the temperature is observed to be 68°, 75°, and 70°, it is permissible to add
the three readings, to divide by 3, and to take the mean, 71°, as representa-
tive of the three separate figures.


REQUIREMENT OF A TRUE ZERO FOR A RATIO SCALE 


A ratio scale meets an additional requirement. It has a true zero point,
representing complete absence of the characteristic. Physical measurements
of time, distance, and weight are on ratio scales, so-called because ratios
and percentages are meaningful. Measurements of weight can reveal
whether one boy is twice as heavy as another; but, since measures of
intelligence, reading comprehension, and the like are not on ratio scales,
there is no way of knowing that one person is twice as intelligent as another
or has twice as much reading comprehension. This is because measure-
ments on psychological and educational tests have no true zero point. To
illustrate, a student might take an examination in mathematics far beyond
his training. If no questions are answered correctly, his obtained score
would be zero. However, such a score would not necessarily indicate
complete ignorance of mathematics. Rather it might mean that the items
are too difficult for him and he is unmeasured.

Measuring devices in psychology and the social sciences certainly yield
ranks along defined dimensions and thus are at least ordinal scales. In
most cases they appear also to qualify reasonably well as interval scales,
permitting summarization through addition. This is fortunate, since
addition is a versatile basis for descriptive and inferential statistics.


THE ARITHMETIC MEAN AND ITS PROPERTIES 


The arithmetic mean, found by adding all the values in a series and
dividing by the number of cases, is perhaps the only important statistic that
is universally familiar. In statistical notation the formula is


$$\bar{X} = M_x = \frac{\Sigma X}{N} \tag{5.1}$$


or, with better identification of the summing operation,


$$\bar{X} = M_x = \frac{1}{N}\sum_{i=1}^{N} X_i \tag{5.1a}$$


This formula, which requires no derivation, defines the mean by indi-
cating the operations that are performed in order to arrive at it. The
variable, denoted as X, is summed over all N cases. The sum is divided
by N, considered a constant because it does not vary within a sample.
The quotient is the mean. This formula describes precisely the way the
mean is found from observed values or "raw scores." A computing
formula, for the specific purpose of finding the mean from a frequency
distribution, is given in connection with Example 5.3.


EXAMPLE 5.1


M_x, V_x, AND s_x FROM ORIGINAL VALUES OF X


With an N of any size, it is always appropriate to find the mean, the variance,
and the standard deviation by using original values and their squares. Appro-
priate procedures, discussed in the text, are summarized as Formulas 5.1, 5.3a,
5.3b, 5.6, 5.8, and 5.8a.

Because of greater accuracy and ease of computation, raw score techniques
are preferable with small N to methods involving coding within a frequency
distribution, as described in Examples 5.3 and 5.4.

When N is large, there is little difference in precision between raw score and
frequency distribution methods. Original values and their squares can be handled
with speed and accuracy on desk calculators, the one disadvantage being that
a frequency distribution is not available for visual inspection. Punch card
machines and electronic computers generally use raw-score methods in proces-
sing large masses of data. If desired, frequency distributions can sometimes be
made more or less automatically as by-products of the computations.

With computing machinery, individual squares are seldom recorded. With a
desk calculator, each X is entered in the keyboard and multiplied by itself. The
accumulation of multipliers is ΣX; and of products, ΣX². Some machine routines
find ΣX² through special summation techniques that do not involve individual
values of X². In the numerical example the X² values are written out only to
clarify the general method.

The scores in the table represent the number of errors made on an objective
test in psychological statistics.


INDIVIDUAL    X     X²      INDIVIDUAL    X     X²
A.A.         29    841      P.E.         25    625
C.J.B.       16    256      B.R.S.       21    441
C.B.         31    961      S.L.K.       40   1600
A.T.B.       37   1369      E.N.P.       25    625
J.E.B.       35   1225      J.A.P.       51   2601
E.P.C.       17    289      I.G.S.       15    225
N.L.C.       34   1156      T.D.S.       39   1521
C.S.C.       28    784      J.S.T.       28    784
R.A.C.       22    484      W.L.W.       33   1089
R.C.C.       12    144      J.A.Z.       36   1296
O.J.E.       21    441


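$$M_x = \frac{\Sigma X}{N} = \frac{595}{21} = 28.33$$

By Formula 5.3a,

$$V_x = \frac{\Sigma X^2}{N} - M_x^2 = \frac{18{,}757}{21} - (28.33)^2 = 893.19 - 802.59 = 90.60$$


By Formula 5.3b,


$$V_x = \frac{N\Sigma X^2 - (\Sigma X)^2}{N^2} = \frac{(21 \times 18{,}757) - (595)^2}{(21)^2} = 90.42$$


By Formula 5.6,

$$s_x = \sqrt{V_x} = \sqrt{90.42} = 9.51$$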


The variance as computed by Formula 5.3a differs slightly from that computed
by its exact algebraic equivalent, Formula 5.3b. In using Formula 5.3a, the mean
has been rounded to two places of decimals and then squared. As computed,
the square of the mean is 802.59 instead of the correct figure of 802.77. If M_x² is
taken as 802.77, results by the two formulas are identical.

The reason why Formula 5.3b gives somewhat better results is that rounding
occurs only at the end. It is a good calculating principle to postpone division as
long as possible.

Of course, as is generally true, the discrepancy in the results has no practical
importance. Often, however, computing errors are easier to detect when inter-
mediate rounding error is avoided.
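
The effect of intermediate rounding in Example 5.1 can be reproduced with a short Python sketch. The variable names are illustrative only; the scores are those of the table, and rounding the mean to two decimals before squaring is the one deliberate step that imitates the hand computation. Full-precision arithmetic may differ from the printed figures in the last decimal place.

```python
# Error scores from the table in Example 5.1 (N = 21).
errors = [29, 16, 31, 37, 35, 17, 34, 28, 22, 12, 21,
          25, 21, 40, 25, 51, 15, 39, 28, 33, 36]

n = len(errors)
sum_x = sum(errors)                     # Sigma X
sum_x2 = sum(x * x for x in errors)     # Sigma X squared

# Formula 5.3a with the mean rounded to two decimals BEFORE squaring,
# which is where the intermediate rounding error enters.
mean_rounded = round(sum_x / n, 2)
var_5_3a = sum_x2 / n - mean_rounded ** 2

# Formula 5.3b: a single division, postponed to the very end.
var_5_3b = (n * sum_x2 - sum_x ** 2) / n ** 2

print(n, sum_x, sum_x2)
print(round(var_5_3a, 2), round(var_5_3b, 2))
```

The two variances differ in the first decimal place only because of the rounded mean; if the unrounded mean is used in Formula 5.3a, the two expressions agree.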


The mean is a single number that in a general way represents a whole 
group of numbers. It can be regarded as a “measure of central tendency,” 
similar in intent to the median discussed in Chapter 4. If there are a few 
values in the series that are so extreme that they would change the mean 
markedly, or if definite values for some of the cases at the ends of the 
distribution cannot be ascertained, the median is preferred to the mean as a
statistic indicating the central point. In most instances, however, if it is
permissible to add the scores, the mean is definitely preferable to the
median.

The mean has the following characteristics:

1. Every value in the series for which a mean is computed affects it.

2. It is a unique point in a set of numbers, namely, the point around
which the deviations¹ sum to zero. In all cases the sum of the positive
deviations from the mean equals the sum of the negative deviations.

3. The mean is also the point around which the sum of the squares of the
deviations is a minimum.

4. The mean is affected directly by any systematic arithmetical change 
in the variable. Thus, if all values in a variable are increased or decreased 
by a constant, the mean is increased or decreased by exactly that amount. 
Similarly, if all values are multiplied or divided by a constant, the mean is 
changed in precisely the same way. This principle is important in the con- 
version of variables to standard scores, as discussed later in this chapter. 
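
The characteristics listed above can be checked numerically. The following Python sketch uses the three temperature readings mentioned earlier (68, 75, and 70); the helper names `mean` and `sum_sq_dev` are hypothetical, and the check of the third characteristic only compares the mean against two nearby points rather than proving the minimum.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by the number of cases."""
    return sum(values) / len(values)


def sum_sq_dev(values, point):
    """Sum of squared deviations of the values from an arbitrary point."""
    return sum((v - point) ** 2 for v in values)


scores = [68, 75, 70]
m = mean(scores)

# Characteristic 2: deviations from the mean sum to zero.
deviations = [v - m for v in scores]
total_deviation = sum(deviations)

# Characteristic 3: the sum of squares is smaller around the mean
# than around points slightly below or above it.
at_mean = sum_sq_dev(scores, m)
below = sum_sq_dev(scores, m - 0.5)
above = sum_sq_dev(scores, m + 0.5)

# Characteristic 4: adding or multiplying by a constant changes the mean likewise.
shifted_mean = mean([v + 10 for v in scores])
scaled_mean = mean([v * 2 for v in scores])
```

Because the three readings sum to 213, the mean is exactly 71, so the deviations are −3, +4, and −1.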


DEVIATIONS FROM THE MEAN AS INDICATORS OF VARIABILITY


As implied in the preceding discussion, a deviation from M_x (denoted by x)
is merely the original value (denoted by X) minus the mean of all the
values in the series. Thus, by definition,


$$x = X - M_x \tag{5.2}$$


Provided knowledge of the mean is retained, a value in deviation form
actually preserves all the information in the original score. By adding the
mean to a deviation from the mean, the original value is restored.

¹ Any variable can be converted into a set of deviations from a constant by subtracting
the constant from each of the values. The constant used as a reference point in forming
deviations is generally the mean; consequently, unless otherwise specified, the term
deviation refers to a deviation from the mean.

Since most psychological and educational tests lack a true zero point,
and since the obtained mean is determined by somewhat arbitrary factors
such as the number and difficulty of the items that happen to compose the
test, it appears that translation of all scores to a series with a predetermined
mean could be useful. If an entire series of original scores were translated
into deviations, the sum of the deviations (and hence their mean) would be
zero. This is easily demonstrated by summing all the terms in Formula
5.2. While Formula 5.2 indicates only a single case, it applies generally to
all values in the series. Conceptually, the process of summing involves
adding together N of these equations, one for each case. The summing of
variables is indicated by the summation sign (Σ). When a constant is
summed (in this instance, M_x), the constant is added N times, which is the
same as multiplying it by N. Substituting ΣX/N for M_x and canceling N
from the numerator and denominator of the fraction, it is seen that the
sum of the deviations from the arithmetic mean is necessarily zero. In
notation:


$$\Sigma x = \Sigma X - N M_x = \Sigma X - N\,\frac{\Sigma X}{N} = \Sigma X - \Sigma X = 0$$

The most important function of the deviations is in indicating the 
degree to which the values in a series of numbers tend to vary. It is apparent 
that if all the deviations are zero, all original values are exactly alike; that 
if the deviations tend to be small, then the original values vary little one 
from another; while if the deviations are large, there is considerable 
variability among the original values. 

The magnitudes of the deviations, then, reflect the variability of the 
original values. Some method of summarizing these magnitudes is needed, 
but since the deviations necessarily sum to zero, their arithmetic mean is 
necessarily zero. In the early days of statistics the “average deviation,” or
“mean deviation,” was computed by summing the deviations as though
all were positive and then dividing by the number of cases. Involving an
operation of questionable mathematical merit and not fitting into a
generally accepted family of descriptive statistics, the “average deviation”
or AD is seldom encountered in present-day research.


THE VARIANCE AND THE STANDARD DEVIATION 


Two exceedingly useful statistics based on deviations are the variance and
the standard deviation. The variance is the more important in theoretical
studies, since it lends itself to analysis into component parts, while the
standard deviation is useful in making scores from different sources
comparable.


FORMULAS FOR THE VARIANCE 


By definition, the variance is the mean of the squares of the deviations; that
is,


$$V_x = \frac{\Sigma x^2}{N} \tag{5.3}$$

The formula² indicates that the variance, here denoted by V_x,³ is
found by squaring each deviation, summing these squares, and dividing
by the number of cases. While Formula 5.3 defines the variance, it is
awkward for actual computation because time is required to find the
deviations and because deviations are usually decimal fractions. Accord-
ingly, a procedure for finding the variance from the mean of the original
values and the sum of their squares is preferable.

Squaring Formula 5.2, which defines any deviation x as equal to
(X − M_x), yields

$$x^2 = X^2 - 2M_x X + M_x^2 \tag{5.4}$$

Summing all terms,

$$\Sigma x^2 = \Sigma X^2 - 2M_x \Sigma X + N M_x^2 \tag{5.5}$$

(In the middle term on the right-hand side of Formula 5.4, 2 and M_x are
constants and X is a variable. The sum of such terms is the sum of the
variable multiplied by the constants, so that the sum of 2M_xX is 2M_xΣX.)

All terms of Formula 5.5 are now divided by N:

$$\frac{\Sigma x^2}{N} = \frac{\Sigma X^2}{N} - \frac{2M_x \Sigma X}{N} + \frac{N M_x^2}{N}$$

Changing ΣX/N to M_x and canceling N in the final term, we have

$$\frac{\Sigma x^2}{N} = \frac{\Sigma X^2}{N} - 2M_x^2 + M_x^2 = \frac{\Sigma X^2}{N} - M_x^2$$

Accordingly, from Formula 5.3,

$$V_x = \frac{\Sigma X^2}{N} - M_x^2 \tag{5.3a}$$

An alternate formula, that may be obtained by substituting ΣX/N
for M_x and by putting all terms over N², is

$$V_x = \frac{N\Sigma X^2 - (\Sigma X)^2}{N^2} \tag{5.3b}$$


² In Chapter 13 it will be noted that the variance is computed by dividing Σx² by (N − 1)
instead of by N. This procedure yields an estimate of a parameter, namely, the vari-
ance in the population. The parameter standard deviation is estimated as
√(Σx²/(N − 1)). This chapter is concerned with variances and standard deviations of
samples, where division by N is appropriate.

³ The symbol V seems to be convenient and acceptable for the variance, although not
widely used. Mathematical statisticians sometimes use "Var." Other symbols often
encountered are s² and σ², which reflect the fact that the variance is the square of the
standard deviation, symbols for which are s (in the sample) and σ (the parameter or
population value).


The variance is difficult to interpret directly. It is not in the units of the
original scale, and its absolute size is markedly affected by scale changes 
involving multiplication or division. Generally, it is merely an inter- 
mediate statistic used in finding another statistic of more direct interest. 
Analysis of a variance is always in terms of a ratio of variances, so that it is 
not necessary for the absolute magnitudes of variances to have signifi- 
cance. In correlation, as described in subsequent chapters, the variance, as 


a proportion of another variance, is a central tool in understanding re- 
lationships among variables. 
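
Two points from this discussion lend themselves to a quick numerical check: the three variance formulas (5.3, 5.3a, and 5.3b) are algebraically identical, and multiplying every score by a constant multiplies the variance by the square of that constant. The Python sketch below uses exact fractions so that no rounding intervenes; the function names and the eight scores are hypothetical illustrations, not data from the text.

```python
from fractions import Fraction


def variance_definition(values):
    """Formula 5.3: mean of the squared deviations from the mean."""
    n = len(values)
    m = Fraction(sum(values), n)
    return sum((v - m) ** 2 for v in values) / n


def variance_raw_a(values):
    """Formula 5.3a: mean of the squares minus the square of the mean."""
    n = len(values)
    m = Fraction(sum(values), n)
    return Fraction(sum(v * v for v in values), n) - m ** 2


def variance_raw_b(values):
    """Formula 5.3b: everything over N squared, one division at the end."""
    n = len(values)
    return Fraction(n * sum(v * v for v in values) - sum(values) ** 2, n ** 2)


scores = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical scores, mean 5
tripled = [3 * v for v in scores]          # same scores on a scale three times as large

v_scores = variance_definition(scores)
v_tripled = variance_definition(tripled)
```

With these scores each of the three functions returns 4, and tripling the scale raises the variance ninefold, which illustrates why the absolute size of a variance is hard to interpret on its own.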


STANDARD DEVIATION AS AN AVERAGE 


The standard deviation is a measure of variability that can be directly 
interpreted. It is applicable to all scales that are in addable units. 


The standard deviation⁴ can be defined as the positive square root of the
variance; that is,


$$s_x = \sqrt{V_x} \tag{5.6}$$


The procedure of taking the square root of the variance acts to restore
the original scale of measurement so that the standard deviation, like the
mean, can be directly interpreted in terms of original units.

An alternate method of interpreting the standard deviation is to regard
it as an average of the deviations from the arithmetic mean.

One of the averages known to mathematics is the quadratic mean, or
root mean square. Obtaining the quadratic mean is a convenient method
of finding a measure of central tendency of a series in which some numbers
are positive and some are negative, and for which an average of the
magnitudes without regard to sign is of interest. Even though the arith-
metic mean of the deviations is necessarily zero, their quadratic mean is an
appropriate average. Each deviation is squared and the sum of the squared
deviations is divided by N. The square root of the quotient is then found.
The result, by definition, is the standard deviation; that is,

$$s_x = \sqrt{\frac{\Sigma x^2}{N}} \tag{5.7}$$

Like the variance, the standard deviation is more readily computed from
obtained values than from actual deviations. To obtain a raw-score for-
mula, it is necessary only to take the square root of both sides of Formula
5.3a. Accordingly,


$$s_x = \sqrt{\frac{\Sigma X^2}{N} - M_x^2} \tag{5.8}$$


An alternate formula is found by substituting ΣX/N for M_x in Formula
5.8. Then,


$$s_x = \sqrt{\frac{\Sigma X^2}{N} - \frac{(\Sigma X)^2}{N^2}} = \frac{1}{N}\sqrt{N\Sigma X^2 - (\Sigma X)^2} \tag{5.8a}$$


⁴ In addition to s, symbols used for the standard deviation in the sample include SD,
S, and σ, which [...], including Karl Pearson, and which [...]. Most statisticians,
however, prefer to reserve σ to indicate the
standard deviation as a parameter.
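
The interpretation of the standard deviation as a quadratic mean of the deviations can be sketched in Python as follows. The function names and the eight scores are hypothetical; `statistics.pstdev` from the standard library, which also divides by N, is included only as an independent check.

```python
import math
import statistics


def quadratic_mean(values):
    """Root mean square: square each value, average the squares, take the root."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def sd_raw_score(values):
    """Raw-score form corresponding to Formula 5.8a (division by N, not N - 1)."""
    n = len(values)
    return math.sqrt(n * sum(v * v for v in values) - sum(values) ** 2) / n


scores = [2, 4, 4, 4, 5, 5, 7, 9]              # hypothetical scores
m = sum(scores) / len(scores)
deviations = [v - m for v in scores]

sd_from_deviations = quadratic_mean(deviations)   # Formula 5.7
sd_from_raw = sd_raw_score(scores)                 # Formula 5.8a
sd_library = statistics.pstdev(scores)             # population (divide-by-N) SD
```

All three routes lead to the same value for these scores, while the arithmetic mean of the same deviations is zero.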


FUNCTIONS OF THE STANDARD DEVIATION 


The standard deviation has four important functions:

1. It is used in transforming obtained values of a variable into a new
system of values with a predetermined standard deviation (and a pre-
determined mean). By such transformations, scores on different variables
are made comparable. This use is discussed under the topic of standard
scores.

2. The standard deviation is the measure of variability essential in
correlation. All formulas for linear correlation include, in some form or
other, the standard deviations of the two variables correlated. The term
correlation applies to a description of the relationship between two
variables after standard deviations have been equalized. This is explained
in Chapter 6.

3. The standard deviation helps to define some of the theoretical
frequency curves, especially the so-called normal curve discussed in
Chapter 11.

4. The standard deviation is used in evaluating differences between
means. This application is presented in Chapter 13.


STANDARD SCORES


Any variable for which it is appropriate to compute a mean and standard
deviation may be transformed linearly⁵ into a new set of values with an
assigned mean and assigned standard deviation.

For theoretical purposes, the most useful of these transformations is the
z score. To compute z scores, the difference between any value and the
mean of the entire series is divided by the standard deviation. In symbols,


$$z_x = \frac{X - M_x}{s_x} \tag{5.9}$$


⁵ A "linear transformation" may be indicated as Y = aX + b, in which X is any original
value, a and b are constants, and Y is the transformed value of X. Y = aX + b is an
equation of a straight line in which a is the slope and b the intercept.

Obviously, since (X − M_x) is the deviation x, this may be written as

$$z_x = \frac{x}{s_x} \tag{5.9a}$$


The z score indicates directly how many standard deviations the original
score deviates from the mean. Thus, for z scores, the standard deviation
itself becomes the unit of measurement. For scores above the mean, z
scores are positive. For scores below the mean, they are negative.

When Formula 5.9a is summed, it is readily seen that Σz_x = Σx/s_x.
However, since Σx = 0, Σz_x, the sum of any set of z scores over the entire
range of the variable, is also precisely zero.


If both sides of Formula 5.9a are squared, summed, and divided by N,
it appears that

$$\frac{\Sigma z_x^2}{N} = \frac{\Sigma x^2}{N s_x^2} = \frac{V_x}{s_x^2} = 1.00$$


Since the z scores are deviations from their mean, it follows that their
variance is 1.00. Also, since the square root of unity is unity, it follows that
the standard deviation is also 1.00 and that V_z and s_z are equal.

When actually used, z scores are generally reported to two places of 
decimals. For practical purposes they suffer from two disadvantages: 


1. They are usually three-digit numbers, which are awkward to handle; 
and 

2. They require an indication of their algebraic sign, since approximately
half of them are negative and the other half positive.


Accordingly, standard scores for practical purposes, such as norming
tests, generally follow a system of positive two-digit numbers. While any
arbitrary mean greater than about three standard deviations can be used,
one of the most popular systems is to assign a mean of 50 and a standard
deviation of 10.

Any system of scores may be readily translated into standard scores
(S.S.) by the use of the following formula:


$$\text{S.S.}_x = \left(\frac{X - M_x}{s_x}\right) s' + M' \tag{5.10}$$


in which s' is the assigned standard deviation and M' is the assigned
mean.

In this formula the effect of the expression (X − M_x)/s_x is to reduce
original values to z scores with mean of zero and standard deviation of
unity. When these z scores are multiplied by s', the standard deviation will,
of course, be s' instead of unity. The mean remains at zero. Now, by adding
M' to each score, the standard deviation remains unchanged as s', while the
mean becomes M'.

To convert values into a series with predetermined mean of 50 and stan-
dard deviation of 10, often called T scores,⁶ Formula 5.10 becomes


$$\text{S.S.}_x = \left(\frac{X - M_x}{s_x}\right) 10 + 50 \tag{5.10a}$$


For computing purposes, Formula 5.10 may be written as


$$\text{S.S.}_x = X\left(\frac{s'}{s_x}\right) + M' - \frac{s' M_x}{s_x} \tag{5.10b}$$


or, in the specific instance of standard scores with an assigned M' of
50 and s' of 10,


$$\text{S.S.}_x = X\left(\frac{10}{s_x}\right) + 50 - \frac{10 M_x}{s_x} \tag{5.10c}$$


Since (M' − s'M_x/s_x) is a constant, it need be computed only once.
Conversion of a variable X into standard scores requires each X to be
multiplied by the ratio of the arbitrary to the obtained standard deviation
(s'/s_x). To this result, the constant (M' − s'M_x/s_x) must be added. A
method of rapidly converting raw scores into standard scores by means
of a calculating machine is illustrated in Example 5.2.
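
The "multiply by a ratio, then add a constant" form of the conversion can be sketched in Python. The names `make_converter` and `standard_score` are hypothetical, and the mean of 30 and standard deviation of 8 are assumed values chosen only for illustration.

```python
def standard_score(x, mean_x, sd_x, new_mean=50.0, new_sd=10.0):
    """Definitional form (Formula 5.10): z score rescaled to the assigned mean and SD."""
    return (x - mean_x) / sd_x * new_sd + new_mean


def make_converter(mean_x, sd_x, new_mean=50.0, new_sd=10.0):
    """Computing form (Formula 5.10b): the ratio and constant are found only once."""
    ratio = new_sd / sd_x                          # s'/s_x
    constant = new_mean - new_sd * mean_x / sd_x   # M' - s'M_x/s_x

    def convert(x):
        return x * ratio + constant

    return convert


# Assumed values for illustration: obtained mean 30, obtained SD 8.
to_t = make_converter(mean_x=30.0, sd_x=8.0)
t_scores = [to_t(x) for x in (30, 34, 38, 46, 26, 22, 6)]
```

Both functions give the same result for any X; the second merely moves the work that does not depend on X out of the per-score step, which is the point of Formula 5.10b.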


EXAMPLE 5.2 


CONVERSIONS OF X TO STANDARD SCORE FORM 


The two systems of standard scores most frequently encountered are z scores
(with M' = .00 and s' = 1.00) and T scores (with M' = 50 and s' = 10). A few
examples of the conversion of X to z and T, if M_x happens to be 30.00 and s_x
happens to be 8.00, are given in the table.


X        EQUIVALENT    EQUIVALENT
VALUE    z SCORE       T SCORE
30          .00           50
34        + .50           55
38        +1.00           60
46        +2.00           70
26        − .50           45
22        −1.00           40
 6        −3.00           20


In the tabulation above only a few selected X values are shown. Only for an
entire set of X would the means necessarily be .00 for z scores and 50 for T scores
and the standard deviations 1.00 and 10, respectively.

⁶ As originally proposed by McCall (2), T scores were normalized scores with M' of 50
and s' of 10 for a reference population of 12-year-olds.

It will be readily seen that the z scores indicate the number of standard devia-
tions the X score is above or below the M_x of 30. Thus, the X of 46 is 16 points,
or 2.00s, above 30, and the X of 26 is four points, or .50s, below 30 (and hence the
z score is −.50).

Similarly, all T scores may be seen to be just as many s above or below their
M' of 50 as the X scores are above or below their mean of 30. Thus, a T score of
60 is 10 units, or 1s, above 50 and corresponds to an X score of 38, which is 8
units, or 1s, above 30.

The following artificial data, arranged in order for convenience, illustrate the
conversion of an entire series of values to z scores and to T scores:


  X       z       z²      T      T²
130     2.00    4.00     70    4900
115     1.50    2.25     65    4225
115     1.50    2.25     65    4225
100     1.00    1.00     60    3600
 94      .80     .64     58    3364
 91      .70     .49     57    3249
 88      .60     .36     56    3136
 85      .50     .25     55    3025
 85      .50     .25     55    3025
 73      .10     .01     51    2601
 70      .00     .00     50    2500
 70      .00     .00     50    2500
 70      .00     .00     50    2500
 67     −.10     .01     49    2401
 55     −.50     .25     45    2025
 55     −.50     .25     45    2025
 52     −.60     .36     44    1936
 49     −.70     .49     43    1849
 46     −.80     .64     42    1764
 40    −1.00    1.00     40    1600
 25    −1.50    2.25     35    1225
 25    −1.50    2.25     35    1225
 10    −2.00    4.00     30     900

ΣX = 1610    Σz = .00    Σz² = 23.00    ΣT = 1150    ΣT² = 59,800
ΣX² = 133,400    N = 23


Before conversion to any system of standard scores, M_x and s_x must be known.
In this case, by Formula 5.1,

$$M_x = \frac{\Sigma X}{N} = \frac{1610}{23} = 70.00$$

By Formula 5.3b,

$$V_x = \frac{(23 \times 133{,}400) - (1610)^2}{(23)^2} = \frac{476{,}100}{529} = 900.00$$

By Formula 5.6,

$$s_x = \sqrt{V_x} = \sqrt{900.00} = 30.00$$

Procedures in converting X to z follow:

1. Since the z have M' = .00 and s' = 1.00, Formula 5.10b is used as follows:

$$\text{S.S.}_x = X\left(\frac{s'}{s_x}\right) + M' - \frac{s' M_x}{s_x} = X\left(\frac{1}{s_x}\right) - \frac{M_x}{s_x}$$

2. Accordingly, each X is divided by s_x (or multiplied by the reciprocal,
1/s_x) and the result is reduced by M_x/s_x. In the example, 1/s_x = .0333 and
M_x/s_x = 2.3333. Accordingly, for X = 130, z = (130 × .0333) − 2.3333 = 2.00.
All other z score equivalents can be found similarly.

3. A complete set of z scores can be checked by finding Σz (which must be
.00, within rounding error) and Σz² (which must be N, also within rounding
error). These are the conditions under which M_z = .00 and V_z = s_z = 1.00.
In this instance the checks work.

Steps in converting X to T are similar:

1. With M' = 50 and s' = 10, Formula 5.10b becomes:

$$\text{S.S.}_x = X\left(\frac{s'}{s_x}\right) + M' - \frac{s' M_x}{s_x} = X\left(\frac{10}{s_x}\right) + 50 - \frac{10 M_x}{s_x}$$

2. The factor by which X is to be multiplied (s'/s_x) is (10/30), or .3333. The
ratio, s'/s_x, is often called the rate. The constant by which X times the rate is
to be increased is [50 − (10 × 70)/30] = 26.6667. To convert the X of 130 to T,
130 is multiplied by .3333, and 26.6667 is added to the product, yielding 70.

3. The entire series of T is checked as follows:

$$M_T = \frac{1150}{23} = 50$$

$$V_T = \frac{(23 \times 59{,}800) - (1150)^2}{(23)^2} = \frac{52{,}900}{529} = 100$$

Formula 5.10b fits nicely into any desk calculator, so that an entire series of
obtained values can be quickly converted to standard scores. Steps are:

1. Compute the rate ($s'/s_x$).

2. Compute the constant ($M' - s'M_x/s_x$).

3. Enter the constant in the product dials, with the number of decimals equal
to those to be used with the rate.

4. Enter the rate in the keyboard.

5. The machine should now read 0 in the quotient dials, while in the product
dials is the equivalent standard score. As X entries are built up in the quotient
dials, the rate adds in the product dials and builds up corresponding standard
scores.

Note: Any negative scores would, of course, be read as complements. However,
in the practice of psychology, individual z scores are rarely, if ever, used.
The method works well with T scores and other standard scores that are positive
throughout the range.
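
The same rate-and-constant procedure can be sketched in Python. This is only an illustration under stated assumptions: the function name `to_standard` and its parameter names are introduced here and are not part of the text, and the values $M_x = 70$ and $s_x = 30$ are those of the worked example above.

```python
def to_standard(x, mean_x, sd_x, new_mean, new_sd):
    """Linear conversion of an obtained value to a standard score (Formula 5.10b)."""
    rate = new_sd / sd_x                    # the "rate", s'/s_x
    constant = new_mean - rate * mean_x     # additive constant, M' - s'M_x/s_x
    return x * rate + constant


# Worked example from the text: M_x = 70, s_x = 30, X = 130.
z = to_standard(130, 70, 30, new_mean=0, new_sd=1)     # z score
t = to_standard(130, 70, 30, new_mean=50, new_sd=10)   # T score
print(round(z, 2), round(t, 2))
```

Computing the rate and the constant once and then applying them to every X mirrors the desk-calculator routine: only one multiplication and one addition are needed per score.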
THE FREQUENCY DISTRIBUTION IN STATISTICS BY AVERAGING


A distribution used for computing the mean, the variance, the standard 
deviation, and other statistics obtained by summation or averaging is 
based on a convention somewhat different from that used in a distribution 
for computing percentiles and other “point measures." 

In computing percentiles, all values within the step are thought of as 
being evenly distributed throughout the step. In computing statistics by 
averaging, all scores within the step are treated as though they fall exactly 
at the midpoint. 

The same distribution may be used for computing percentile measures 
and then again for summation measures. However, in changing from 
statistics obtained by ranking to statistics obtained by summation, the
concept of how the frequencies are distributed within the steps must be 
altered. 

By regarding all the scores within a step as falling exactly at the mid- 
point, scores are in effect coded, thus considerably reducing the size of 
the numbers to be handled. 


CODING AS A COMPUTATIONAL AID 


It will be recalled that if a constant is added to or subtracted from all the 
scores in a series, the mean is correspondingly increased or decreased.
Normally a constant is subtracted in order to reduce the numerical size 
of scores. However, in some cases, a constant is added in order to eliminate 
negative scores. 

To find the mean for a series of scores that have been coded by adding 
or subtracting a constant, the coded scores are summed and divided by N 
in the usual fashion. Then comes a final step of adjusting the result by 
adding or subtracting the constant. For example, if 65 has been subtracted 
from all scores, then 65 must be added to the mean of the coded scores in 
order to correct it to the mean of the original scale. 

Adding or subtracting a constant to all scores has no effect upon the 
standard deviation. 

Scores may be coded also by multiplying or dividing by a constant. 
Such an operation affects both the mean and the standard deviation. 
After computations are made in the usual fashion, they are corrected by 
multiplication for scores coded by division, and by division for scores 
coded by multiplication. For example, if all scores have been coded by 
dividing them by 3, the obtained mean and standard deviation must be 
multiplied by 3 to find the mean and standard deviation of the original 
series. 
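
The effect of both kinds of coding can be checked numerically. The following Python sketch uses a small set of hypothetical scores chosen only for illustration; `pstdev` from the standard library divides by N, which matches the definition of the standard deviation used in this chapter.

```python
from statistics import fmean, pstdev

scores = [72, 68, 81, 65, 77, 70]   # hypothetical raw scores, for illustration only

# Coding by subtraction: 65 is subtracted, then added back to the coded mean.
coded_sub = [x - 65 for x in scores]
mean_restored = fmean(coded_sub) + 65
sd_from_sub = pstdev(coded_sub)           # needs no correction

# Coding by division: statistics of the coded scores are multiplied by 3.
coded_div = [x / 3 for x in scores]
mean_from_div = fmean(coded_div) * 3
sd_from_div = pstdev(coded_div) * 3

print(fmean(scores), mean_restored, mean_from_div)
print(pstdev(scores), sd_from_sub, sd_from_div)
```

Each printed line should show three equal values (apart from floating-point rounding), the first for the mean and the second for the standard deviation.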

Both types of coding may be used simultaneously. For example, in 
computations from a frequency distribution, the usual procedure is to 
code both by subtraction and by division. The midpoint of a step interval 
is taken as an arbitrary origin. In taking the origin at such a midpoint,
we are in effect subtracting that value from all the scores. Also, all scores
are handled as deviations in terms of step intervals from that midpoint.
In so doing, the scores are, after the subtractive operation, divided by
the step interval. Computations are then carried out in terms of coded
scores taken as deviations in terms of step intervals from the assumed
mean. These coded scores are, of course, much smaller than the original
numbers. The arithmetical labor in handling them is much reduced over
handling raw scores by hand methods. This principle is the basis of the
methods of finding the mean and standard deviation illustrated in Examples
5.3 and 5.4.


EXAMPLE 5.3


$M_x$, $V_x$, AND $s_x$ FROM A FREQUENCY DISTRIBUTION⁷
(Conventional Method)


Distribution of 200 Scores on an Aptitude Test


                            x'            x'²
STEPS        f      d   (i.e., df)   (i.e., d²f)
140-144      1      9        9           81
135-139      2      8       16          128
130-134      3      7       21          147
125-129      6      6       36          216
120-124      8      5       40          200
115-119     12      4       48          192
110-114     15      3       45          135
105-109     20      2       40           80
100-104     23      1       23           23
95-99       26      0     +278            0
90-94       24     -1      -24           24
85-89       20     -2      -40           80
80-84       16     -3      -48          144
75-79       10     -4      -40          160
70-74        7     -5      -35          175
65-69        4     -6      -24          144
60-64        2     -7      -14           98
55-59        0     -8        0            0
50-54        1     -9       -9           81
                          -234         2108

N = 200      Σx' = 278 - 234 = 44
i = 5        Σx'² = 2108
M' = 97


7 Since two variables are involved in the examples in Chapter 6, the notation for
frequencies and computations involving coded scores is made more explicit. Thus, M', i,
f, and d are written with a subscript to indicate the variable concerned; x' becomes $d_x f_x$,
and x'² becomes $d_x^2 f_x$, to show more clearly the operations in finding Σx' and Σx'².
The Data. The 200 scores on the aptitude test are considered to be addable.
The scores are distributed on steps that extend from .5 below the stated lower
limit to .5 above the stated upper limit. Accordingly, the true dividing points or
partition values for the first five steps are 49.5, 54.5, 59.5, 64.5, 69.5, and 74.5.
In computing the mean, the variance, and the standard deviation, each score is
treated as though it were exactly equal to the midpoint of the step in which it is
classified.

Coding System. By distributing the scores into steps or classes and establishing
the midpoint of one of the steps as the assumed mean (M'), all the scores are
automatically coded. Thus:

1. Any score X is taken as the midpoint X' of the step in which it falls.

2. In effect, the assumed mean M' is subtracted from X' and the difference is
divided by i, the step interval. This yields x', the number of step intervals the
score deviates from the assumed mean. Thus, for any case,

$$x' = \frac{X' - M'}{i}$$

3. In practice, after the distribution has been made, coding is accomplished
merely by writing the d's opposite the frequencies. Each d indicates the
number of step intervals the scores in a class deviate from M', the assumed
mean.

4. The midpoint of any class in the distribution can be chosen as M', but if
a class near the center of the distribution is chosen, the computations will
involve smaller numbers. In addition, if M' is below the true mean, the
correction to M' to find $M_x$ will be positive. To avoid all negative quantities,
M' is often the midpoint of the bottom step, as in Example 5.4.


Identification of the Columns. For any single case, x' and d are identical.
However, the column headed d is merely the coding apparatus. What is needed
is the sum of the d values for the N cases. This sum is denoted as Σx'. The column
headed x' is, for each step, the sum of the x' for all the cases in that step or class.
It is found by multiplying f, the class frequency, by d, the deviation of the class
in step intervals from the assumed mean. Products corresponding to negative
d's are negative.


The column headed x'² also carries sums; that is, the sum of the x'² or d² for
all the cases in the step. These entries may be computed by multiplying each f
by its corresponding d² (not indicated in the distribution) or by multiplying the
entry in the d column by the corresponding entry in the x' column. Thus the
first four entries are 1 x 81 = 81, 2 x 64 = 128, 3 x 49 = 147, and 6 x 36 = 216;
or, more readily, 9 x 9 = 81, 8 x 16 = 128, 7 x 21 = 147, and 6 x 36 = 216.

Computation of the Mean. The mean computed from a frequency distribution
is the mean of the X', or scores coded as midpoints. It is found by Formula 5.13
as follows:

$$M_x = M' + i\frac{\Sigma x'}{N} = 97 + 5\left(\frac{44}{200}\right) = 98.10$$

Computation of the Variance. In finding the variance from a frequency
distribution, each score X is again treated as though it were X', the value at the
midpoint of the step in which it falls. With computations in terms of x' and x'²,
Formula 5.14 yields

$$V_x = i^2\left[\frac{N\Sigma x'^2 - (\Sigma x')^2}{N^2}\right] = 25\left[\frac{(200 \times 2108) - (44)^2}{(200)^2}\right] = 262.29$$

Computation of the Standard Deviation. The standard deviation is merely the
square root of the variance. Thus

$$s_x = \sqrt{V_x} = \sqrt{262.29} = 16.20$$
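
The computations of Example 5.3 can be reproduced with a short Python sketch. The frequencies below are those of the example's distribution, listed from the highest step to the lowest; the variable names are assumptions made for this illustration only.

```python
from math import sqrt

# Example 5.3: class frequencies from step 140-144 down to step 50-54.
freqs = [1, 2, 3, 6, 8, 12, 15, 20, 23, 26, 24, 20, 16, 10, 7, 4, 2, 0, 1]
devs = list(range(9, -10, -1))   # d for each step, from 9 down to -9

assumed_mean = 97.0   # M', midpoint of the step 95-99
step = 5              # i, the step interval

n = sum(freqs)
sum_x = sum(f * d for f, d in zip(freqs, devs))         # sum of x' (column df)
sum_x2 = sum(f * d * d for f, d in zip(freqs, devs))    # sum of x' squared (column d^2 f)

mean = assumed_mean + step * sum_x / n                  # Formula 5.13
variance = step**2 * (n * sum_x2 - sum_x**2) / n**2     # Formula 5.14
sd = sqrt(variance)                                     # Formula 5.6

print(n, sum_x, sum_x2, round(mean, 2), round(variance, 2), round(sd, 2))
```

All work is carried out in step-interval units, and the step interval and assumed mean enter only in the last three lines, which is the point of the coding.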


Charlier's Check. Computations of Σx' and Σx'² from a frequency distribution
may be verified by Charlier's check. This usually involves new computations with
an assumed mean one step interval lower than the assumed mean originally
selected. The effect is to use a series of deviations, each of which is 1 greater than
the d originally used. They may be denoted as x''. The check equations may be
derived as follows. Let

$$x'' = x' + 1 \qquad (5.11)$$

Summing 5.11,

$$\Sigma x'' = \Sigma x' + N \qquad (5.11a)$$

Also, squaring 5.11,

$$x''^2 = x'^2 + 2x' + 1 \qquad (5.11b)$$

Summing 5.11b,

$$\Sigma x''^2 = \Sigma x'^2 + 2\Sigma x' + N \qquad (5.11c)$$


Charlier's check can be applied to the example as follows:


STEPS        f      d     d'      x''     x''²
140-144      1      9     10      10      100
135-139      2      8      9      18      162
130-134      3      7      8      24      192
125-129      6      6      7      42      294
120-124      8      5      6      48      288
115-119     12      4      5      60      300
110-114     15      3      4      60      240
105-109     20      2      3      60      180
100-104     23      1      2      46       92
95-99       26      0      1      26       26
90-94       24     -1      0       0        0
85-89       20     -2     -1     -20       20
80-84       16     -3     -2     -32       64
75-79       10     -4     -3     -30       90
70-74        7     -5     -4     -28      112
65-69        4     -6     -5     -20      100
60-64        2     -7     -6     -12       72
55-59        0     -8     -7       0        0
50-54        1     -9     -8      -8       64
                                +394
                                -150     2396

Computations

$$\Sigma x'' = \Sigma x' + N$$
$$244 = 44 + 200$$

$$\Sigma x''^2 = \Sigma x'^2 + 2\Sigma x' + N$$
$$2396 = 2108 + 88 + 200$$




Since Σx'' and Σx''² as computed agree with Σx'' and Σx''² as found by the
check equations, the computations are considered to be correct. While
compensating errors might conceivably occur, their likelihood is exceedingly small.
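
Charlier's check lends itself to a small Python sketch. The function name `charlier_check` and its arguments are assumptions made for this illustration: the function takes the sums reported from the original worksheet and tests them against sums recomputed with every deviation increased by 1.

```python
def charlier_check(freqs, devs, reported_sum_x, reported_sum_x2):
    """Return True if the reported sums satisfy check equations 5.11a and 5.11c."""
    n = sum(freqs)
    # Recompute with the assumed mean one step lower: each deviation is d + 1.
    shifted_sum = sum(f * (d + 1) for f, d in zip(freqs, devs))
    shifted_sum2 = sum(f * (d + 1) ** 2 for f, d in zip(freqs, devs))
    return (shifted_sum == reported_sum_x + n
            and shifted_sum2 == reported_sum_x2 + 2 * reported_sum_x + n)


# Example 5.3 frequencies, highest step first, with d running from 9 down to -9.
freqs = [1, 2, 3, 6, 8, 12, 15, 20, 23, 26, 24, 20, 16, 10, 7, 4, 2, 0, 1]
devs = list(range(9, -10, -1))

print(charlier_check(freqs, devs, 44, 2108))   # sums as found in the example
print(charlier_check(freqs, devs, 45, 2108))   # a deliberately wrong sum
```

The check is useful only when the two sets of sums are obtained by separate arithmetic, as they are when the original sums come from a hand worksheet.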


EXAMPLE 5.4


$M_x$, $V_x$, AND $s_x$ BY TWO CALCULATING-MACHINE METHODS


(Conventional Procedure)


                                    x'            x'²
STEPS        f      d     d²    (i.e., df)   (i.e., d²f)
160-169      1     12    144        12          144
150-159      7     11    121        77          847
140-149     40     10    100       400        4,000
130-139    123      9     81     1,107        9,963
120-129    117      8     64       936        7,488
110-119    137      7     49       959        6,713
100-109     96      6     36       576        3,456
90-99       54      5     25       270        1,350
80-89       33      4     16       132          528
70-79        8      3      9        24           72
60-69        1      2      4         2            4
50-59        0      1      1         0            0
40-49        1      0      0         0            0

N = 618      Σx' = 4,495
             Σx'² = 34,565


The Data. The distribution represents scores of 618 patrolmen who took a
competitive merit pay examination.

Procedure. Except that the arbitrary origin (M') is the midpoint of the lowest
step, the procedure is the same as that illustrated in Example 5.3. Frequencies
are multiplied by d's (deviations from M' in step-interval units) to form the
entries in the column headed x'. Each entry is the sum of the d's for all cases
tabulated in the step. Similarly, the f's are multiplied with the d²'s to form
the entries in the x'² column, which are sums of the d² for cases tabulated in the
step. It is not necessary to write out the d² because x'² entries can be found by
multiplying each d by the entry in the x' column.

Actually, a skillful calculating-machine operator can work directly from the
frequencies, using them as multipliers of the successive natural numbers (1, 2,
3...) to form Σx' and of the squares of the successive natural numbers (1, 4,
9...) to form Σx'².

Charlier's check, of course, applies. The procedure is to use the midpoint
below M' for the new arbitrary origin.

Computations. By Formula 5.13,

$$M_x = M' + i\frac{\Sigma x'}{N} = 44.5 + \frac{10 \times 4495}{618} = 117.23$$

By Formula 5.14,

$$V_x = i^2\left[\frac{N\Sigma x'^2 - (\Sigma x')^2}{N^2}\right] = 100\left[\frac{(618 \times 34{,}565) - (4495)^2}{(618)^2}\right] = \frac{115{,}614{,}500}{381{,}924} = 302.7160$$

By Formula 5.6,

$$s_x = \sqrt{V_x} = \sqrt{302.7160} = 17.40$$

Cumulative Frequency Procedure. A procedure for finding Σx' and Σx'² from
cumulative frequencies was developed by DuBois (1). It is based on the algebraic
principle that when two series of numbers are multiplied together in pairs, the
sum of the products is identical under two conditions:

1. The two series are unmodified; or
2. One series is cumulated and the other consists of differences between terms.

The differences between successive squares are successive odd numbers, as
shown in the table column headed m (for multiplicand).

When the cumulative frequencies are found directly from the tallies, the f are not
needed. The successive odd numbers beginning with 1 in the step above M' are
multiplied by the corresponding Cf. The sum of the multipliers, that is, the Cf,
is Σx'. The sum of the products, that is, the mCf, is Σx'². If a desk calculator is used,
multipliers and products are allowed to accumulate and the entries headed mCf
need not be written out. It is to be noted that N appears as the final Cf, and that
Σx' and Σx'² are found in a single series of machine operations. Charlier's check
applies, with the usual shift in arbitrary origin.

As the arbitrary origin M' is the same in both sets of computations, the Σx'
are identical, as are the Σx'². This procedure is illustrated in the table.


STEPS        f     Cf      m      mCf
160-169      1      1     23       23
150-159      7      8     21      168
140-149     40     48     19      912
130-139    123    171     17    2,907
120-129    117    288     15    4,320
110-119    137    425     13    5,525
100-109     96    521     11    5,731
90-99       54    575      9    5,175
80-89       33    608      7    4,256
70-79        8    616      5    3,080
60-69        1    617      3    1,851
50-59        0    617      1      617
40-49        1    618

N = 618
Σx' = ΣCf = 4,495
Σx'² = ΣmCf = 34,565
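
The cumulative frequency procedure can be sketched in Python and compared with the conventional sums. The frequencies are those of Example 5.4; the variable names are assumptions made for this illustration.

```python
from itertools import accumulate

# Example 5.4 frequencies, highest step (160-169) first. The last entry is the
# lowest step (40-49), whose midpoint is the arbitrary origin M'.
freqs = [1, 7, 40, 123, 117, 137, 96, 54, 33, 8, 1, 0, 1]

cum = list(accumulate(freqs))     # Cf, cumulated from the top step downward
n = cum[-1]                       # the final Cf is N
cf_above_origin = cum[:-1]        # Cf for the steps above M'

# Successive odd numbers, with 1 assigned to the step just above M'.
k = len(cf_above_origin)
odds = [2 * j - 1 for j in range(k, 0, -1)]    # 23, 21, ..., 3, 1

sum_x = sum(cf_above_origin)                                    # sum of Cf
sum_x2 = sum(m * cf for m, cf in zip(odds, cf_above_origin))    # sum of mCf

# Conventional method, for comparison: d runs from 12 down to 0.
devs = list(range(len(freqs) - 1, -1, -1))
conv_sum_x = sum(f * d for f, d in zip(freqs, devs))
conv_sum_x2 = sum(f * d * d for f, d in zip(freqs, devs))

print(n, sum_x, sum_x2, conv_sum_x, conv_sum_x2)
```

The agreement of the two pairs of sums illustrates the algebraic principle stated above: cumulating one series while differencing the other leaves the sum of products unchanged.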
ERROR INTRODUCED BY CODING


A certain degree of error may be introduced by the coding process, since 
all the scores within a step are treated as falling exactly at the midpoint. 
When the number of cases is small, there may be considerable discrepancy 
between the mean and standard deviation, as computed by coding, when 
compared with the same statistics computed from raw scores. However, 
when the number of steps is reasonably large, say, ten or more, and the 
distribution is fairly regular and symmetrical, the error is negligible. A 
correction⁸ exists to compensate for this “grouping error” in computing
the standard deviation, but the amount of correction is small and of little 
practical consequence. 


FORMULAS FOR M AND s BASED ON CODING


The use of coding results in computations using deviations from an
arbitrary point (the “assumed mean”) rather than deviations from the
actual mean. If M' represents an assumed mean and x'' is any deviation
from it, then

$$X = M' + x''$$

Summing,

$$\Sigma X = NM' + \Sigma x''$$

Dividing by N,

$$\frac{\Sigma X}{N} = M' + \frac{\Sigma x''}{N} = M_x$$

In computations from a frequency distribution, an additional step is
involved, namely, the coding of the deviations in step-interval units,
denoted as x', which is the nearest whole number resulting when x'' (the
deviation from the assumed mean) is divided by i (the step interval). In
finding means and standard deviations from a frequency distribution, all
work is done in step-interval units and is translated into raw-score units
at the end of the computations.

It is approximately true and can be taken as true that

$$X = M' + ix' \qquad (5.12)$$

Summing,

$$\Sigma X = NM' + i\Sigma x'$$

Dividing by N,

$$\frac{\Sigma X}{N} = M_x = M' + \frac{i\Sigma x'}{N} \qquad (5.13)$$


8 This adjustment is known as Sheppard's correction. Prior to extracting the root, the
variance is reduced by i²/12.
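This is a preferred formula for finding the mean from a frequency
distribution. Its use is illustrated in Examples 5.3 and 5.4.

For the standard deviation we first find a formula for the deviation x by
the use of Formulas 5.12 and 5.13:

$$x = X - M_x = (M' + ix') - \left(M' + \frac{i\Sigma x'}{N}\right) = i\left(x' - \frac{\Sigma x'}{N}\right) \qquad (5.13a)$$

Squaring both sides,

$$x^2 = i^2\left[x'^2 - 2x'\frac{\Sigma x'}{N} + \left(\frac{\Sigma x'}{N}\right)^2\right]$$

Summing and dividing by N,

$$\frac{\Sigma x^2}{N} = V_x = i^2\left[\frac{\Sigma x'^2}{N} - \frac{2(\Sigma x')^2}{N^2} + \left(\frac{\Sigma x'}{N}\right)^2\right] = i^2\left[\frac{\Sigma x'^2}{N} - \left(\frac{\Sigma x'}{N}\right)^2\right] \qquad (5.14)$$

Extracting square roots and identifying $\sqrt{V_x}$ as the standard deviation,

$$s_x = i\sqrt{\frac{\Sigma x'^2}{N} - \left(\frac{\Sigma x'}{N}\right)^2} \qquad (5.15)$$

or

$$s_x = \frac{i}{N}\sqrt{N\Sigma x'^2 - (\Sigma x')^2} \qquad (5.15a)$$

Formula 5.15 is sometimes preferred for computations by hand, while
Formula 5.15a fits better into a desk calculator.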


PROPERTIES OF THE STANDARD DEVIATION 


Properties of the standard deviation may be summarized as follows: 


1. It is a number, in original scale units, that represents the variability of a
series.

2. Mathematically, it is an average (the “quadratic mean” or “root mean
square”) of the deviations from the arithmetic mean.

3. Since it takes into account the magnitudes of all values in the series, it is
a more stable measure of variability than those based upon points in
the distribution, such as the range and Q.

4. It is closely related to other descriptive statistics that summarize data
through summation. The mean helps define the standard deviation,
which is the square root of the variance; and, as will be seen later, the
standard deviation helps to define the correlation coefficient and the
normal distribution curve.
wish to know the mean score for the test as a whole. Designating the
several variables as $X_1, X_2, \ldots, X_n$ and their means as $M_1, M_2, \ldots, M_n$, and
designating the total or sum variable as $X_t$ and its mean as $M_t$, it can be
stated that

$$X_1 + X_2 + \cdots + X_n = X_t$$

Summing,

$$\Sigma X_1 + \Sigma X_2 + \cdots + \Sigma X_n = \Sigma X_t$$

Dividing by N,

$$M_1 + M_2 + \cdots + M_n = M_t \qquad (5.18)$$


Accordingly, the mean of a sum variable is merely the sum of the 
means of its constituent variables. 
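
Formula 5.18 can be verified numerically with a brief Python sketch. The part scores below are hypothetical values for four examinees, chosen only for illustration.

```python
from statistics import fmean

# Hypothetical part scores for four examinees (illustrative values only).
part_1 = [12, 15, 9, 14]
part_2 = [20, 18, 22, 16]
part_3 = [7, 9, 6, 10]

# Total score for each examinee is the sum of the three part scores.
totals = [a + b + c for a, b, c in zip(part_1, part_2, part_3)]

mean_of_total = fmean(totals)
sum_of_means = fmean(part_1) + fmean(part_2) + fmean(part_3)
print(mean_of_total, sum_of_means)
```

The two printed values agree, whatever the correlations among the parts may be.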


Procedures for finding the variance and the standard deviation of a sum 
variable are discussed in Chapter 9. 


SUMMARY 


Many of the varieties of numerical information used in psychology are 
summarized by averaging. The arithmetic mean, which is the sum of all 
the values in a series divided by the number of cases, is the most frequently 
used measure of central tendency. The variance, defined as the mean of the 
squares of the deviations from the arithmetic mean, is a measure of
variability, which can sometimes be analyzed into component parts. Its
square root is known as the standard deviation, the most useful of the
direct measures of variability. 

Obtained values can be transformed linearly into a new set of values 
with assigned mean and standard deviation. For theoretical purposes, the 
most useful of these transformations is the z score with mean of .00 and 
variance of 1.00. For practical purposes, two-digit standard scores with 
no negative values are more convenient. Of these, the most common is the 
T score, with assigned mean of 50 and assigned standard deviation of 10. 

This chapter is concerned in part with computing methods, including the 
finding of statistical constants from obtained values and scores coded by 
procedures applicable to frequency distributions. 


EXERCISES 


1. Using a step interval of 5, make a distribution of the following scores from a
chemistry placement test:

Find $M_x$, $V_x$, and $s_x$.


2. For the following distribution, find Σx' and Σx'² from any arbitrary origin
and then apply Charlier's check to verify the results. Find $M_x$, $V_x$, and $s_x$.


STEPS     f        STEPS     f
42-43     1        28-29    12
40-41     3        26-27     5
38-39     7        24-25     3
36-37     1        22-23     0
34-35    18        20-21     2
32-33    20        18-19     2
30-31    16


3. For the following distribution, find Σx' and Σx'² by the conventional method
and also from cumulative frequencies and the successive odd numbers. Use
the midpoint of the lowest step as the arbitrary origin. Compute $M_x$, $V_x$,
and $s_x$.


STEPS     f
42-44     3
39-41     3
36-38     6
33-35     8
30-32    10
27-29    15
24-26    12
21-23     7
18-20     4


4. If the mean of a set of values is 18.2 and the standard deviation is 3.9, compute
the rate ($s'/s_x$) and the additive constant ($M' - s'M_x/s_x$) to convert the
original values to standard scores with a mean of 50 and standard deviation
of 10. What standard score corresponds to an obtained value of 26?


5. In a sample of 200 cases, the following M and s were found for three parts 
of a test: 


              M       s
Part I      40.8    10.3
Part II     30.7     8.2
Part III    29.5     9.5


If the total score is found by summing the three part scores, what is the total 
mean? Is enough information available to find the total standard deviation ? 


6. Find the mean and standard deviation of the following scores (words recalled) 
on a memory test consisting of 25 words: 22, 22, 18, 18, 17, 17, 16, 14, 13, 
20, 19, 18, 18, 17, 17, 16, 15, 15. 


7. Convert the following set of raw scores to z scores and also to T scores: 
37, 71, 49, 88, 21, 64, 55, 19, 42, 54. 


8. Group 1, consisting of 30 women, has a mean weight of 122 pounds and
standard deviation of 8 pounds. Group 2, consisting of 40 men, has a mean
weight of 155 pounds and standard deviation of 12 pounds. Find the mean and
standard deviation of the combined group.
6

LINEAR CORRELATION AND REGRESSION: TWO VARIABLES


THE NATURE OF CORRELATION


Science comprises statements of relationship between variables. Examples
include: the association between temperature and the state of a substance,
such as water or a metal; and the relationship between pressure and the
volume of a gas, temperature remaining constant. A relationship in
physics can often be stated as a mathematical function with minimal error,
but in constructing a science of the behavior of human beings, variables are
difficult to identify; the number of variables operant simultaneously is
large; and relationships between pairs of variables are far from perfect.

An important statistic for describing relationships in the social sciences
is product-moment¹ correlation, which involves fitting a straight line to a
two-dimensional plot of the observations in a manner such that the best
possible fit is obtained.


1 Karl Pearson, who developed the formula in common use, taught applied mathematics
at University College, London. He referred to certain functions of deviations as moments,
a term taken from mechanics. His formula involves finding the average of the products
of pairs of deviations, modified so that the standard deviations of the two variables are
equal. As applied to two observed variables, the formula yields “zero order r.”
REGRESSION AND CORRELATION


1. The relationship between two variables is expressed as the equation
of a straight line, the regression² equation.

2. The degree of correlation is unaffected by linear conversion of one 
of the variables, or both, entering into the correlation. However, numerical 
values of constants in the regression equation used for predicting values in 
one variable from a knowledge of values in the other are based on: (1) the 
degree of relationship; and (2) the means and standard deviations of the 
two variables. 

3. When the variables have been modified so that their standard 
deviations are equal, the slope of the line indicates the degree of relation- 
ship. 

4. Without implying that one variable is the cause of the other, we may, 
for convenience, arbitrarily call one the independent variable and the 
other, the dependent variable. In the sample in which the correlation is 
computed, each value of the dependent variable may be divided uniquely 
into two uncorrelated portions: a part perfectly correlated with the 
independent variable and a part uncorrelated with it. 

5. In using the regression equation to predict values of the dependent 
variable, or criterion, in cases for which values of only the independent 
variable are available, the proportion of the variance of the dependent 
variable that can be predicted can, under certain assumptions, be stated 
precisely. 

6. In general, correlation requires paired sets of measurements on scales 
amenable to addition. Most frequently, in education and psychology, the 
measurements that are paired are of the same individual, and the corre- 
lation is a summary of the relationship existing in a sample of N individuals. 
It is impossible to find a correlation from a single pair of observations or
between a variable observed in one sample and a different variable observed
in a second sample.


LINEAR AND NONLINEAR RELATIONSHIPS 


There is nothing in psychological and educational measurement that makes
fitting straight lines better than fitting curved lines. It is conceivable that
many relationships between psychological variables may ultimately turn
out to be curvilinear rather than linear. Up to the present time, however,
age is perhaps the only variable showing consistent curvilinear relationship
with psychological variables. Excluding relationships with age, most
correlations so far found appear to be linear. Because of this fact, linear
correlation is used in psychological investigations almost to the exclusion
of curvilinear correlation.


2 Sir Francis Galton, who first described correlation, introduced the term regression.
In applying the principle of correlation to characteristics of children paired with parents,
he noted that children tend to regress toward the mean. For example, while the children
of tall parents tend to be above average in height, they are closer to the mean of all
children than their parents are to the parents’ mean. Similarly, children of short parents
tend to be below average in height, but again closer to the mean of all children in the
group than their parents are to the parents’ mean. This phenomenon always appears
when two variables are linearly related and when the relationship is less than perfect.


GRAPHING THE RELATIONSHIP BETWEEN TWO VARIABLES 


Consider the following five pairs of values, $X_0$ and $X_1$, and their z score
equivalents:

CASE     X_0      z_0     X_1      z_1
A          8     -.50       1    -1.50
B          2    -1.50       3     -.50
C         14     +.50       4      .00
D         11      .00       5     +.50
E         20    +1.50       7    +1.50


These data are plotted on Cartesian coordinates in Fig. 6.1. The dependent
variable, $z_0$, is represented on the vertical axis, or ordinate; the
independent variable, $z_1$, on the horizontal axis, or abscissa. Each of the
five plotted points (denoted as circles identified by letters) represents
two z-score values for a single case: a value in $z_1$ as shown by its
projection on the horizontal axis; and a value in $z_0$ as shown by its
projection on the vertical axis.

FIG. 6.1. A REGRESSION LINE. (THE LINE LL' IS THE LINE OF BEST FIT
CONNECTING OBSERVATIONS IN $z_1$ WITH THE OBSERVATIONS IN $z_0$. SINCE
z SCORES ARE PLOTTED, r IS THE SLOPE OF THE LINE.)

Sometimes the plot of points indicates no relationship at all between the
two variables. Under such circumstances the points tend to fall in a circular
area. In Fig. 6.1 they fall approximately in an ellipse, indicating some degree
of relationship. Since low values in $z_1$ tend to be associated with low
values in $z_0$, and vice versa, the correlation is positive. Had low values in
one variable been associated with high values in the other (as indicated
by points running from upper left to lower right), the association would
have been negative. If all points fall exactly on a straight line, then
knowledge of a value in one variable leads to precise knowledge of the
corresponding value in the other variable. In such instances, correlation is
perfect.

In psychological research, high correlations appear chiefly when a group 
of individuals is measured twice with the same instrument or with two 
equivalent measuring devices. Correlations between most psychological 
variables tend to be moderate or low. 


r AS THE SLOPE OF THE LINE OF BEST FIT 


A coefficient of correlation, denoted as r, is the slope of a straight line of 
best fit when two variables have been modified (if necessary) so that their 
standard deviations are equal, and when pairs of scores have been plotted 
as single points on Cartesian coordinates, as in Fig. 6.1. 

A generally accepted mathematical convention, known as “least
squares,” states that the line of best fit can be defined as the line
around which the sum of the squares of the errors in fitting (that is to say,
the “misses,” or residuals) is at a minimum.

In Fig. 6.1, consider the line LL'. This is the line used to predict $z_0$ scores
(values of the dependent variable or criterion) from $z_1$ scores (values of the
independent variable). Denoting as $\tilde{z}_0$ the score in $z_0$ predicted from $z_1$, we
wish to establish the slope of this line such that the sum of the squares
of the residuals, $\Sigma(z_0 - \tilde{z}_0)^2$, be as small as possible. When plotted on
Cartesian coordinates, a line is thought to consist of points, each having
two values, one of which is its projection on the y axis and the other of
which is its projection on the x axis. The slope of a straight line passing
through the origin is any nonzero y value divided by the corresponding
x value.

All the $z_0$ values of points on the line LL' are $\tilde{z}_0$'s, as obtained from the
equation $\tilde{z}_0 = \beta z_1$, in which $\beta$, as the slope of the line, can take any value
between -1.00 and +1.00. The problem is to find the value of $\beta$ such that
the sum of the squares of the differences between the observed values $z_0$
and the values of $\tilde{z}_0$ (on the line LL') is as small as possible.
A rigorous derivation³ of the formula for r, the coefficient of correlation,
consists of finding the value of $\beta$ under the stated condition.


DERIVATION OF r

In any particular case the difference between $z_0$ and $\tilde{z}_0$ can be written as

$$z_0 - \tilde{z}_0 = z_0 - \beta z_1$$

Squaring both sides, and summing and dividing by N,

$$\frac{\Sigma(z_0 - \tilde{z}_0)^2}{N} = \frac{\Sigma z_0^2}{N} - 2\beta\frac{\Sigma z_0 z_1}{N} + \beta^2\frac{\Sigma z_1^2}{N}$$

Since the variances of z scores are unity, 1.00 can be substituted for
$\Sigma z_0^2/N$ and $\Sigma z_1^2/N$:

$$\frac{\Sigma(z_0 - \tilde{z}_0)^2}{N} = 1 - 2\beta\frac{\Sigma z_0 z_1}{N} + \beta^2$$

On the right-hand side of the equation, $(\Sigma z_0 z_1/N)^2$ can be subtracted
and added as follows:

$$\frac{\Sigma(z_0 - \tilde{z}_0)^2}{N} = 1 - \left(\frac{\Sigma z_0 z_1}{N}\right)^2 + \left(\frac{\Sigma z_0 z_1}{N}\right)^2 - 2\beta\frac{\Sigma z_0 z_1}{N} + \beta^2$$

The last three terms now form a perfect square and can be factored:

$$\frac{\Sigma(z_0 - \tilde{z}_0)^2}{N} = 1 - \left(\frac{\Sigma z_0 z_1}{N}\right)^2 + \left[\frac{\Sigma z_0 z_1}{N} - \beta\right]^2$$

It is now seen:

1. If $\Sigma(z_0 - \tilde{z}_0)^2/N$ is as small as possible, then $\Sigma(z_0 - \tilde{z}_0)^2$ will also be a
minimum.

2. This condition will obtain if $[(\Sigma z_0 z_1/N) - \beta]^2$, the last expression on the
right-hand side of the equation, equals zero.

3. If $\beta = \Sigma z_0 z_1/N$, $[(\Sigma z_0 z_1/N) - \beta]^2$ will equal zero and $\Sigma(z_0 - \tilde{z}_0)^2$
will be as small as possible. Accordingly, the required slope is $\Sigma z_0 z_1/N$.

When the slope of the line is the mean of the products of the z scores in
the two variables (that is, $\Sigma z_0 z_1/N$), it can be denoted as $r_{01}$, the coefficient
of correlation, or with slight change in notation, as

$$r_{xy} = \frac{\Sigma z_x z_y}{N} \qquad (6.1)$$

3 The usual derivation involves differentiating $y = 1 - 2\beta(\Sigma z_0 z_1/N) + \beta^2$ with respect
to $\beta$, and setting the first derivative equal to zero. The exposition in the text accomplishes
the same result by what is mathematically the identical procedure, but which is in an
algebraic idiom. See Treloar (4).

in which x is any variable and y is any other variable for which paired
measurements exist. This coefficient varies from —1.00 (indicating perfect 
inverse relationship) through .00 (indicating no relationship) to +1.00 
(indicating perfect, positive relationship). 

The derivation of r given above assumes nothing about the shape of 
the true line of best fit. The true line of best fit might be a parabola or 
some other curve. What is actually fitted is a straight line. If the regression 
is linear, then the line fitted by the correlation technique would be the 
line of best fit. If the regression is not linear, then the straight line with slope 
of r is not the best possible function connecting the two variables. In that 
case, the coefficient r does not provide an adequate description of the 
relationship. 

To find r for the data of Fig. 6.1, the mean z-score product can be
found as follows:

CASE      z0       z1       z0 z1
A        -.50    -1.50       .75
B       -1.50     -.50       .75
C        +.50      .00       .00
D         .00     +.50       .00
E       +1.50    +1.50      2.25

                 Sum of z0 z1 = 3.75

By Formula 6.1,

$$r_{01} = \frac{\sum z_0 z_1}{N} = \frac{3.75}{5} = .75$$

In Fig. 6.1, the line LL' has been drawn so that its slope is .75; that is,
for every point on the line (except at the origin), the $\hat{z}_0$ value divided by
the corresponding $z_1$ value is .75. In making predictions of $z_0$ from $z_1$, the
vertical distance between the $z_1$ point on the horizontal axis and the
corresponding point on the regression line, or line of best fit, is the value
of the predicted score, $\hat{z}_0$.

It will be noted that in this particular instance, none of the points
representing pairs of original observations falls precisely on the line LL'.
The deviations, or $(z_0 - \hat{z}_0)$ distances, are shown as vertical heavy lines
connecting the points with the regression line. It is the sum of the squares
of these distances that has been minimized. From a line plotted with slope
other than r, the sum of the squares of the residuals would necessarily be
greater.
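
The minimizing property can be checked numerically. The following Python sketch uses the five z-score pairs of Fig. 6.1; the function name and the grid of trial slopes are illustrative assumptions, not part of the text.

```python
# z scores for cases A-E of Fig. 6.1
z0 = [-0.50, -1.50, 0.50, 0.00, 1.50]   # criterion
z1 = [-1.50, -0.50, 0.00, 0.50, 1.50]   # predictor


def sum_sq_residuals(slope):
    """Sum of (z0 - slope * z1)^2 for a line through the origin."""
    return sum((a - slope * b) ** 2 for a, b in zip(z0, z1))


# Formula 6.1: r is the mean of the z-score products
r = sum(a * b for a, b in zip(z0, z1)) / len(z0)

# Try slopes from .00 to 1.50 in steps of .01 and keep the one
# with the smallest sum of squared residuals
trial_slopes = [k / 100 for k in range(151)]
best_slope = min(trial_slopes, key=sum_sq_residuals)

print(r, best_slope, sum_sq_residuals(r))
```

Within the grid tried, no slope gives a smaller sum of squared residuals than the slope equal to r.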


THE TWO-VARIABLE REGRESSION EQUATION 


In product-moment correlation, then, a straight line is, in effect, fitted to
plotted pairs of observations, although with conventional computing
methods the points are seldom plotted and the line itself is seldom explicit.
If the line passes near the points so that the residuals or discrepancies in
fitting the line are small, the correlation is high; if the points tend to lie
away from the line, the correlation is low. The correlation coefficient, a
pure number that is independent of the units in which the two variables
are measured, is both the slope of the line of best fit and an indication
of the goodness of fit.

If $X_0$ and $X_1$ are known to be positively related, one might suppose that
the $z_1$ value would be the best estimate of the corresponding value in $z_0$.
While the derivation given above of the formula for r shows that this is not
the case, a numerical illustration may be of interest. In the tables below,
for the data used for Fig. 6.1, are $z_0$ and $z_1$ values, their differences, the
differences squared, the $\hat{z}_0$ values from the regression equation ($\hat{z}_0 = r_{01} z_1$),
the residuals around the regression line ($z_0 - \hat{z}_0$), and the residuals squared.

DISCREPANCIES BETWEEN OBSERVED VALUES

CASE      z0       z1     (z0 - z1)   (z0 - z1)^2
A        -.50    -1.50      1.00        1.00
B       -1.50     -.50     -1.00        1.00
C        +.50      .00       .50         .25
D         .00     +.50      -.50         .25
E       +1.50    +1.50       .00         .00

                 Sum of (z0 - z1)^2 = 2.50

DISCREPANCIES BETWEEN OBSERVED AND PREDICTED VALUES

CASE      z0     predicted z0,    (z0 - predicted z0)   (z0 - predicted z0)^2
                 i.e., r01 z1
A        -.50      -1.125               .625                  .391
B       -1.50       -.375             -1.125                 1.266
C        +.50        .000               .500                  .250
D         .00       +.375              -.375                  .141
E       +1.50      +1.125              +.375                  .141

                 Sum of (z0 - predicted z0)^2 = 2.189


It will be seen that the sum of the squares of the residuals around the
regression line is less than the sum of the squares of the differences
between pairs of observed values. This is necessarily true in all situations in
which r is less than 1.00.

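
The two totals can be reproduced directly. The sketch below is an illustration in Python, not part of the original exposition; it also compares each total with a closed form that holds for z scores whose variance is computed with N in the denominator: 2N(1 - r) for the differences between observed values and N(1 - r^2) for the residuals. The unrounded residual total is 2.1875; the 2.189 in the table comes from adding entries already rounded to three places.

```python
z0 = [-0.50, -1.50, 0.50, 0.00, 1.50]
z1 = [-1.50, -0.50, 0.00, 0.50, 1.50]
n = len(z0)

r = sum(a * b for a, b in zip(z0, z1)) / n          # Formula 6.1

# Using z1 itself as the estimate of z0
ss_observed = sum((a - b) ** 2 for a, b in zip(z0, z1))

# Using the regression estimate r * z1
ss_residual = sum((a - r * b) ** 2 for a, b in zip(z0, z1))

print(ss_observed, 2 * n * (1 - r))       # both 2.5
print(ss_residual, n * (1 - r ** 2))      # both 2.1875
```

Since 2(1 - r) exceeds (1 - r)(1 + r) whenever r is below 1.00, the comparison always comes out in favor of the regression estimate.
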
RAW-SCORE REGRESSION EQUATION 


The regression equation in z form is not particularly useful for making
predictions in practical instances, since in handling educational and
psychological data, obtained values are seldom converted into z scores.
It is possible, however, to modify the z score regression equation into a
form suitable for predicting values in $X_0$ from obtained values in $X_1$.

It will be recalled that a z score is merely the number of standard
deviations that a value is above or below the mean, and that $z_1$, for
example, is $(X_1 - M_1)/s_1$. Designating the predicted $X_0$ value as $\hat{X}_0$,
$(\hat{X}_0 - M_0)/s_0$ is substituted for $\hat{z}_0$, and $(X_1 - M_1)/s_1$ for $z_1$. If the
regression equation is written as

$$\hat{z}_0 = r_{01} z_1 \qquad (6.2)$$

then

$$\hat{X}_0 = r_{01}\frac{s_0}{s_1} X_1 + M_0 - r_{01}\frac{s_0}{s_1} M_1 \qquad (6.3)$$

This is the equation⁴ of a straight line with slope of $r_{01} s_0 / s_1$ and intercept
(the $\hat{X}_0$ value when $X_1 = 0$) of $(M_0 - r_{01} M_1 s_0 / s_1)$.

A regression equation provides a procedure for predicting a score in a 
second variable when the score in the first is known. In a particular 
sample, the means and standard deviations of the two variables and the 
correlation between them can be organized as an equation for making 
predictions for cases not in the original sample, but for which the score 
on one of the variables is available. 

Suppose, for example, in a representative sample the relationship 
between an intelligence test and success in a course of training has been 
ascertained. To a new applicant we can: (1) administer the test; and (2) 
make a prediction as to his probable training success. Obviously, the higher 
the correlation, the more closely the prediction is likely to approximate 
the criterion value, when and if the criterion value is determined. 
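
As an illustration, Formula 6.3 can be written as a small Python function. The function name and all numerical values below (test mean 100, test standard deviation 15, criterion mean 70, criterion standard deviation 10, r = .60) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def predict_raw(x1, r01, m0, s0, m1, s1):
    """Predicted criterion score by Formula 6.3 (raw-score regression)."""
    slope = r01 * s0 / s1            # slope of the raw-score regression line
    intercept = m0 - slope * m1      # predicted value when x1 is 0
    return slope * x1 + intercept


# Hypothetical sample constants, not taken from the text
r01, m0, s0, m1, s1 = 0.60, 70.0, 10.0, 100.0, 15.0

applicant_score = 115.0              # one standard deviation above the test mean
predicted = predict_raw(applicant_score, r01, m0, s0, m1, s1)
print(predicted)
```

An applicant one standard deviation above the test mean is predicted to fall r standard deviations above the criterion mean, which is the z-form equation expressed in raw-score units.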

For a more precise description of prediction, it is necessary to consider 
the division of the criterion variance into two portions: the variance of the 
predictions and the variance of the residuals. 


——— 


4 Other forms of the regression equation are possible. To develop the deviation form,
for example, $\hat{x}_0/s_0$ is substituted for $\hat{z}_0$ and $x_1/s_1$ for $z_1$ in Formula 6.2. This yields
$\hat{x}_0 = (r_{01} s_0 / s_1) x_1$, a form of the regression equation that could be used for predicting
deviations in $X_0$ from deviations in $X_1$. Since, in making predictions, one may be
interested in relative standing without predicting actual criterion values, the equation
could be modified so that predicted values would be standard scores with an assigned
mean and standard deviation.


DIVISION OF $V_0$ INTO $\hat{V}_0$ AND $V_{0.1}$


If any variable is transformed into z scores, its variance is 1.00. This
variance can be represented as $V_0$, which can be divided into two parts,
$\hat{V}_0$ and $V_{0.1}$, where $\hat{V}_0$ represents the variance of the predicted scores.
To find the value of this variance, both sides of the regression equation,
$\hat{z}_0 = r_{01} z_1$, are squared. This gives

$$\hat{z}_0^2 = r_{01}^2 z_1^2$$

Summing and dividing by N, and dropping out the variance of $z_1$, which
is 1.00,

$$\hat{V}_0 = r_{01}^2 \qquad (6.4)$$

The value of the variance of the residuals, $V_{0.1}$, can be found somewhat
similarly:

$$z_{0.1} = z_0 - \hat{z}_0 = z_0 - r_{01} z_1 \qquad (6.5)$$

Squaring,

$$z_{0.1}^2 = z_0^2 - 2 r_{01} z_0 z_1 + r_{01}^2 z_1^2$$

Summing and dividing by N,

$$\frac{\sum z_{0.1}^2}{N} = \frac{\sum z_0^2}{N} - 2 r_{01}\frac{\sum z_0 z_1}{N} + r_{01}^2\frac{\sum z_1^2}{N}$$

Considering that $\sum z_0 z_1 / N = r_{01}$, and that the variance of a set of z scores
is unity, it can be stated that

$$V_{0.1} = 1 - 2 r_{01} r_{01} + r_{01}^2 = 1 - r_{01}^2 \qquad (6.6)$$

It is now apparent that

$$\hat{V}_0 + V_{0.1} = 1 = V_0 \qquad (6.7)$$

Generalizing, it can be said that any z score, $z_0$, can be divided into two
portions: the predicted value $\hat{z}_0$, and the residual $z_{0.1}$. The predicted
score $\hat{z}_0$ is correlated perfectly with the predictor variable $z_1$, since it is
merely $z_1$ multiplied by a constant. The residual $z_{0.1}$ is uncorrelated with
$z_1$. This can be shown by multiplying Formula 6.5 by $z_1$, with the result:

$$z_1 z_{0.1} = z_0 z_1 - r_{01} z_1^2$$

Summing and dividing by N,

$$\frac{\sum z_1 z_{0.1}}{N} = \frac{\sum z_0 z_1}{N} - r_{01}\frac{\sum z_1^2}{N} = r_{01} - r_{01} = .00$$

Since the sum of products of $z_1$ and $z_{0.1}$ is zero, their correlation is
zero.

With original $X_0$ values, $V_0$ in general does not equal 1.00, but the
principle that the total variance is divisible into two uncorrelated portions
is still applicable; that is,

$$V_0 = \hat{V}_0 + V_{0.1} \qquad (6.7a)$$
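
The division of the variance can be verified on the z scores of Fig. 6.1. The Python sketch below is illustrative only; it uses `statistics.pvariance`, which divides by N as the text does, and the variable names are assumptions introduced for the example.

```python
from statistics import pvariance

z0 = [-0.50, -1.50, 0.50, 0.00, 1.50]
z1 = [-1.50, -0.50, 0.00, 0.50, 1.50]
n = len(z0)

r = sum(a * b for a, b in zip(z0, z1)) / n           # .75

predicted = [r * b for b in z1]                      # regression estimates of z0
residual = [a - p for a, p in zip(z0, predicted)]    # Formula 6.5

v_predicted = pvariance(predicted)                   # Formula 6.4: r^2
v_residual = pvariance(residual)                     # Formula 6.6: 1 - r^2

# Mean product of predictor and residual; zero means they are uncorrelated
mean_cross_product = sum(b * e for b, e in zip(z1, residual)) / n

print(v_predicted, v_residual, v_predicted + v_residual, mean_cross_product)
```

With r = .75 the predicted variance is .5625 and the residual variance .4375, and the two add to the total variance of 1.00.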


PREDICTABLE VARIANCE OF THE CRITERION 


Of the total variance of the criterion, represented as $V_0$, a certain
proportion is generally predictable from an outside variable such as $X_1$.
If $r_{01}$, the correlation between $X_0$ and $X_1$, is zero, then no part of $X_0$ is
predictable from $X_1$. On the other hand, if $r_{01}$ is +1.00, then $X_0$ is
completely predictable from $X_1$. These are the two extremes.

Since with z scores, $\hat{V}_0 = r_{01}^2$, it follows that $\hat{V}_0 / V_0 = r_{01}^2$.
If original values are used, the predicted variance becomes $r_{01}^2 V_0$, in
which $V_0$ refers to a variance in original units. Accordingly, the proportion
of the variance predicted by $X_1$ is $r_{01}^2 V_0 / V_0$ or $r_{01}^2$, just as with the z score
formulation.


UNPREDICTED VARIANCE 


The same reasoning applies to the unpredicted variance. The portion of the
variance of the criterion that is unpredictable from a knowledge of the
predictor is $V_{0.1}$. In z scores, as deduced from Formula 6.5, $V_{0.1}$ is $(1 - r_{01}^2)$,
and in original units it is $V_0(1 - r_{01}^2)$. In either case, $V_{0.1}$ represents the
variance of the residuals around the regression line and $(1 - r_{01}^2)$ is the
proportion of the criterion variance unpredictable from the independent
variable.

This variance of the residuals is also known as a partial variance, and its
square root is a partial standard deviation. It reflects the variability that
remains in $X_0$ when the variability associated with $X_1$ has been removed by
subtraction.


HOMOSCEDASTICITY AROUND THE REGRESSION LINE 


Figure 6.2 represents a two-variable scatter diagram for 49 cases (artificial
data). The two scales are such that $V_0 = V_1 = 8$. Since the two standard
deviations are equal, the slope of the regression line LL' is $r_{01}$. It has a
numerical value of .707. As in most correlation diagrams, this line is not
actually shown.

From column to column in Fig. 6.2, the variability is the same, since
there is one $X_0$ unit between successive cases. In each column the mean
would be exactly on the regression line. Accordingly, a straight line is the
line of best fit, and the regression is linear. Also, the variance of the values
in each column is exactly 4. The vertical arrays show homoscedasticity,
meaning that the variability from column to column is the same.

It will be seen that there are seven different $X_1$ values and that
corresponding $\hat{X}_0$ values (values exactly on the regression line) are 3, 4, 5, 6, 7,
8, and 9. The variance of these $\hat{X}_0$ values is 4. Formula 6.7a holds, and

$$V_0 = \hat{V}_0 + V_{0.1} = 4 + 4 = 8$$

Again the variance of the criterion has been divided into two portions:
the predicted variance $\hat{V}_0$, and the unpredicted or residual variance $V_{0.1}$.
In the specific instance, $\hat{V}_0 = V_{0.1}$, but this happens only when r = .707.
When r is greater than or less than .707, the predicted variance is greater
than or less than the residual variance.


[Figure 6.2: scatter diagram of $X_0$ (vertical axis) against $X_1$ (horizontal axis, with columns at 0, √2, 2√2, 3√2, 4√2, 5√2, and 6√2); seven cases in each column; N = 49.]

FIG. 6.2. SCATTER DIAGRAM SHOWING HOMOSCEDASTICITY IN THE
VERTICAL DIRECTION


In this artificial example, there are seven cases in each column (or
vertical array). Since the regression for the 49 cases is perfectly linear, the
mean of each set of seven cases corresponds to the $\hat{X}_0$ value for
the column and falls exactly on the regression line. The variances of the
seven sets of seven cases each are identical. Accordingly, the seven standard
deviations of the "errors of prediction" (that is, values obtained by
subtracting predicted values from observed values) are all identical. Thus, in
predicting $X_0$ from $X_1$ the bivariate distribution is homoscedastic around
the regression line.
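
A data set with the properties described for Fig. 6.2 can be built and checked in a few lines of Python. The exact layout is an assumption made for this sketch: seven columns at $X_1$ = 0, √2, ..., 6√2, predicted values 3 through 9, and within each column seven $X_0$ values one unit apart and centered on the predicted value.

```python
from math import sqrt
from statistics import mean, pvariance

# Assumed reconstruction of the 49 artificial cases
columns = [[3 + k + d for d in range(-3, 4)] for k in range(7)]
x1 = [k * sqrt(2) for k in range(7) for _ in range(7)]
x0 = [value for column in columns for value in column]
n = len(x0)

v0, v1 = pvariance(x0), pvariance(x1)
m0, m1 = mean(x0), mean(x1)
covariance = sum((a - m1) * (b - m0) for a, b in zip(x1, x0)) / n
r = covariance / sqrt(v0 * v1)

# Homoscedasticity: the variance is the same in every column
column_variances = [pvariance(column) for column in columns]

# Column means lie on the regression line; their variance is the predicted variance
v_predicted = pvariance([mean(column) for column in columns])

print(v0, v1, r, column_variances, v_predicted)
```

Under these assumptions both total variances are 8, every column variance is 4, the predicted variance is 4, and r is .707.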

If an observed sample of cases is taken as representative of cases not
yet observed, then (on the further assumption of homoscedasticity) the
standard deviation of the residuals in the sample can be taken as the
standard deviation of errors in making predictions of $X_0$ values when only
$X_1$ values are known. It is then called the standard error of estimate. The
formula is easily developed. In z form, the variance of the residuals,
$V_{0.1}$, is $1 - r_{01}^2$. To find this variance in raw-score form, $(1 - r_{01}^2)$ is
multiplied by the variance of the criterion in raw-score form, which yields
$V_0(1 - r_{01}^2)$. The square root of this variance is the standard deviation of
the residuals when $X_0$ is predicted, and becomes the standard error of
estimate, denoted as $s_{est\,0}$. Thus

$$s_{0.i} = s_{est\,0} = s_0\sqrt{1 - r_{0i}^2} \qquad (6.8)$$

in which the subscript 0 refers to the criterion variable, and the subscript
i refers to any predictor.

If the errors in fitting the straight line connecting the two variables have
widely differing standard deviations within the columns, Formula 6.8 will
not be a good estimate of the standard deviations of the errors of
prediction in different parts of the range. However, when the sample used in
computing r is representative of samples yet to be studied, when regression
is linear, and when homoscedasticity can be assumed, the formula for
$s_{est\,0}$ helps in the interpretation of r. The smaller $s_{est\,0}$, the more accurate
the predictions. When r is 1.00 (or -1.00), there is no error in estimation;
and, at the other extreme, when r is .00, the standard deviation of the
errors equals the standard deviation of the criterion. It is to be noted that
the sign of the correlation coefficient has no effect on the standard error
of estimate. Prediction on the basis of a negative r is just as effective as
prediction on the basis of a positive r of the same absolute magnitude.
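
Formula 6.8 is short enough to state as a Python function. The sketch below is an illustration; the criterion standard deviation of 10 is a hypothetical value, and the function name is not from the text.

```python
from math import sqrt


def standard_error_of_estimate(s0, r):
    """Formula 6.8: standard deviation of the residuals around the regression line."""
    return s0 * sqrt(1 - r ** 2)


s0 = 10.0   # hypothetical criterion standard deviation

for r in (0.0, 0.6, -0.6, 1.0, -1.0):
    print(r, standard_error_of_estimate(s0, r))
```

The loop shows the points made above: r of .00 leaves the error equal to the criterion standard deviation, r of 1.00 or -1.00 leaves no error, and r of .60 and -.60 give the same standard error of estimate.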


THE COVARIANCE AS AN AVERAGE 


In connection with the derivation of r it was noted that $\sum (z_0 - \hat{z}_0)^2$, the
sum of the squares of the deviations from the regression line, is minimal
when the slope of the line is $\sum z_0 z_1 / N$. This gives the basic formula (from
Formula 6.1) for r as

$$r_{01} = \frac{\sum z_0 z_1}{N}$$

or

$$r_{xy} = \frac{\sum z_x z_y}{N} \qquad (6.1a)$$

Inspection of this formula shows that a coefficient of correlation is the
arithmetic mean of the products of pairs of z scores in the two variables.
If each $z_0$ equals its corresponding $z_1$, then $r = \sum z_0 z_1 / N = \sum z_0 z_0 / N = 1.00$,
which is the upper limit of the correlation coefficient. If each $z_0$ equals its
corresponding $z_1$ in absolute magnitude, but is opposite in sign, then
$r = \sum z_0 z_1 / N = \sum z_0(-z_0)/N = -\sum z_0^2 / N = -1.00$, which is the lower
limit of r. If the $z_0 z_1$ terms do not vary together, but sum to zero, then
r = .00, indicating no linear relationship between the two variables.

When deviation scores in two variables are multiplied together in 
pairs and summed and divided by N, the result is known as a covariance. 
Since a z score is a deviation score indicating the number of standard 
deviations a case is above or below the mean, r is the z score covariance. 

A covariance is one way of stating the relationship between two variables.
When there is no linear relationship, the covariance is .00. However, the
maximum value of a covariance is a function of the standard deviations
of the two variables concerned. Accordingly, a covariance is generally
interpreted as a proportion of a variance, or it is converted into a
correlation coefficient by division by the standard deviations of the two
variables from which it derives.

Since covariances can sometimes be added together and sometimes
fractionated into components, they are crucial in the study of
relationships involving more than two variables. The concept of a correlation as a
covariance of z scores with mean of zero and variance of unity is
particularly convenient in the formulation of multiple and partial correlation, as
discussed in Chapters 7 and 8, and in the treatment of sum variables (that
is, variables obtained by adding case by case the values on two or more
constituent variables), as discussed in Chapter 9.
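
The relation between a covariance and r can be illustrated with a short Python sketch. The five pairs of values are hypothetical, and the helper names are introduced only for this example. The sketch shows that the covariance changes when one variable is expressed in different units, whereas r, the covariance of the z scores, does not.

```python
from math import sqrt
from statistics import mean, pstdev


def covariance(xs, ys):
    """Mean product of paired deviation scores (N in the denominator)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance divided by the two standard deviations."""
    return covariance(xs, ys) / (pstdev(xs) * pstdev(ys))


# Hypothetical paired measurements
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]

# r as the covariance of z scores
zx = [(v - mean(x)) / pstdev(x) for v in x]
zy = [(v - mean(y)) / pstdev(y) for v in y]
r_from_z = covariance(zx, zy)

# Expressing x in units one-tenth as large multiplies the covariance by 10
x_rescaled = [10 * v for v in x]

print(covariance(x, y), covariance(x_rescaled, y))
print(correlation(x, y), correlation(x_rescaled, y), r_from_z)
```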

COMPUTING FORMULAS FOR r 


Formula 6.1a can be easily modified to find r from deviations in original
score units. For each z, its equivalent, x/s, is substituted. Then,

$$r_{ij} = \frac{\sum (x_i/s_i)(x_j/s_j)}{N} = \frac{\sum x_i x_j}{N s_i s_j} \qquad (6.1b)$$

The final expression is obtained by placing the constants $s_i$ and $s_j$
outside the summation sign. This is justifiable, since the sum of a variable
with each term multiplied by a constant is the constant times the sum of the
variable. If the variables are denoted as x and y, Formula 6.1b becomes

$$r_{xy} = \frac{\sum xy}{N s_x s_y} \qquad (6.1c)$$

Neither Formula 6.1a, nor 6.1b, nor 6.1c is very useful for
computational purposes. Both z scores and deviations require conversion from
the original values. Since the mean is generally a decimal fraction, each
z score and each deviation is also a rather awkward decimal fraction. For
greater accuracy, and especially for use with calculating machines, formulas
in terms of original values are far more convenient.


[Table 6.1: the individual values are not recoverable from the source. The column sums used in Example 6.1 are N = 20; $\sum X_i = 197$; $\sum X_j = 316$; $\sum X_i^2 = 3591$; $\sum X_j^2 = 6196$; $\sum X_i X_j = 4073$; $\sum (X_i - X_j) = -119$; $\sum (X_i - X_j)^2 = 1641$; $\sum (X_i + X_j) = 513$; $\sum (X_i + X_j)^2 = 17{,}933$.]


RAW-SCORE FORMULAS

A raw-score equivalent for Formula 6.1c can be developed by noting that
the deviation $x_i$ is $(X_i - M_i)$ or $(X_i - \sum X_i / N)$ and that $x_j$ is $(X_j - \sum X_j / N)$.
Then,

$$x_i x_j = \left(X_i - \frac{\sum X_i}{N}\right)\left(X_j - \frac{\sum X_j}{N}\right) = X_i X_j - \frac{X_i \sum X_j}{N} - \frac{X_j \sum X_i}{N} + \frac{\sum X_i \sum X_j}{N^2}$$

Summing and dividing by N,

$$\frac{\sum x_i x_j}{N} = \frac{\sum X_i X_j}{N} - \frac{\sum X_i \sum X_j}{N^2} - \frac{\sum X_i \sum X_j}{N^2} + \frac{\sum X_i \sum X_j}{N^2} = \frac{N\sum X_i X_j - \sum X_i \sum X_j}{N^2}$$

Dividing by $s_i s_j$ and their equivalents, as given by Formula 5.8a, so as to
build the left-hand term up to Formula 6.1b,

$$r_{ij} = \frac{\sum x_i x_j}{N s_i s_j} = \frac{N\sum X_i X_j - \sum X_i \sum X_j}{\sqrt{N\sum X_i^2 - (\sum X_i)^2}\,\sqrt{N\sum X_j^2 - (\sum X_j)^2}} \qquad (6.9)$$

To utilize this formula, we need N, the number of cases; $\sum X_i X_j$, the
sum of the cross-products; $\sum X_i$ and $\sum X_j$, the sums of the values in the two
variables; and $\sum X_i^2$ and $\sum X_j^2$, the sums of the values squared. The use of
the formula is illustrated in Example 6.1.

EXAMPLE 6.1 


ALTERNATE FORMULAS IN THE COMPUTATION OF r FROM RAW SCORES 


In finding correlations from raw scores, each X is, in effect, taken as a deviation
from 0 in terms of a step interval of 1. Means of cross-products and means of
squares are, in effect, corrected to find covariances and variances.

In Table 6.1 are shown the individual values of $X_i$, $X_j$, $X_i^2$, $X_j^2$, $X_i X_j$, $(X_i - X_j)$,
$(X_i - X_j)^2$, $(X_i + X_j)$, and $(X_i + X_j)^2$. Since only sums of these values (readily
obtained with a desk calculator) are of interest, there is actually no need to write
out the individual values of squares or cross-products. The information on
differences, sums, and their squares is presented to show certain checking
procedures and to illustrate the use of alternate formulas.


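Computations of Variances and Standard Deviations

$$V_i = \frac{N\sum X_i^2 - (\sum X_i)^2}{N^2} = \frac{(20 \times 3591) - (197)^2}{400} = \frac{33{,}011}{400} = 82.53$$

$$s_i = \sqrt{V_i} = 9.08$$

$$V_j = \frac{N\sum X_j^2 - (\sum X_j)^2}{N^2} = \frac{(20 \times 6196) - (316)^2}{400} = \frac{24{,}064}{400} = 60.16$$

$$s_j = \sqrt{V_j} = 7.76$$

$$V_{i-j} = \frac{N\sum (X_i - X_j)^2 - [\sum (X_i - X_j)]^2}{N^2} = \frac{(20 \times 1641) - (-119)^2}{400} = \frac{18{,}659}{400} = 46.65$$

$$V_{i+j} = \frac{N\sum (X_i + X_j)^2 - [\sum (X_i + X_j)]^2}{N^2} = \frac{(20 \times 17{,}933) - (513)^2}{400} = \frac{95{,}491}{400} = 238.73$$

Computation of r

By Formula 6.9,

$$r_{ij} = \frac{N\sum X_i X_j - \sum X_i \sum X_j}{\sqrt{N\sum X_i^2 - (\sum X_i)^2}\,\sqrt{N\sum X_j^2 - (\sum X_j)^2}} = \frac{(20 \times 4073) - (197 \times 316)}{\sqrt{(20 \times 3591) - (197)^2}\,\sqrt{(20 \times 6196) - (316)^2}} = \frac{19{,}208}{28{,}185.57} = .68$$

By Formula 6.10,

$$r_{ij} = \frac{V_i + V_j - V_{i-j}}{2 s_i s_j} = \frac{82.53 + 60.16 - 46.65}{2 \times 9.08 \times 7.76} = \frac{96.04}{140.92} = .68$$

By Formula 6.11,

$$r_{ij} = \frac{V_{i+j} - V_i - V_j}{2 s_i s_j} = \frac{238.73 - 82.53 - 60.16}{2 \times 9.08 \times 7.76} = \frac{96.04}{140.92} = .68$$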

Generally, only formulas involving sums of cross-products, such as Formula
6.9, are used for computing r from raw scores. If each set of $X_i$ and $X_j$ is entered
simultaneously in the keyboard of a desk calculator that has, say, seven places
between the decimal positions of each, and then multiplied by the same $X_i$ and
$X_j$ (also with seven places between the decimal positions), then in repeated
operations $\sum X_i$ and $\sum X_j$ will accumulate in the multiplier dials (with seven
places between decimal positions), and $\sum X_i^2$, $2\sum X_i X_j$, and $\sum X_j^2$ will accumulate
in the product dials (also with seven places between decimal positions). (Some
machines have squaring devices that permit keyboard entries to be used both as
multiplicands and as multipliers.)

Formula 6.9 is also generally used when the intercorrelations of a number of
variables are needed. If the variables are $X_i$, $X_j$, ..., $X_n$, then $\sum X_i$ and $\sum X_i^2$ need
be found only once for each variable (except perhaps for checking), while each
sum of cross-products (such as $\sum X_i X_j$) is found again as $\sum X_j X_i$. If
cross-products check, it is generally assumed that sums of squares are correct.


FORMULAS FOR CODED SCORES 


A parallel development yields a formula for r in terms of scores coded as
$x'_i$ and $x'_j$, the number of step intervals from "assumed means." The
derivation involves using Formula 5.13a to set up equivalents of $x_i$ and $x_j$,
the true deviation scores in the two variables, as follows:

$$x_i x_j = i_i\left(x'_i - \frac{\sum x'_i}{N}\right) i_j\left(x'_j - \frac{\sum x'_j}{N}\right)$$

in which $i_i$ is the step interval used to code variable i, and $i_j$ is the step
interval used to code variable j.

Multiplying terms, summing, writing equivalents, and dividing by the
standard deviations (in the form given by Formula 5.15a), the following
formula for r is obtained:

$$r_{xy} = \frac{N\sum x'y' - \sum x' \sum y'}{\sqrt{N\sum x'^2 - (\sum x')^2}\,\sqrt{N\sum y'^2 - (\sum y')^2}} \qquad (6.9a)$$

If the derivation of this formula is written out in full, it will be seen that
$i_x$ and $i_y$, representing the two step intervals, drop out from both the
numerator and denominator. The use of this formula⁵ is demonstrated
in Examples 6.2 and 6.3.


EXAMPLE 6.2


THE COMPUTATION OF r FROM A SCATTER DIAGRAM
(Conventional Method)


Theory. While the computation of r, using original raw scores, may be regarded
as yielding maximum accuracy, there is actually little loss in precision if the
correlation is found through the use of coded scores. The coded scores
demonstrated in this example are of the kind employed in Example 5.3 for the
computation of the mean and the variance. Here, however, two variables are treated
simultaneously, instead of only one. While not needed in solving for r, means and
standard deviations for the two variables can be readily found from the scatter
diagram (Table 6.2).

Preparation of Scatter Diagram. It is sometimes stated that 10 to 20 steps are
needed to code each variable if the obtained correlation is to be a close
approximation of the r obtained from raw scores. However, empirical studies have shown
that as few as nine steps can adequately represent a variable when N is large,
say, 1000 or more.

Table 6.2 shows the scores of 20 first-year law students in two tests:
vocabulary and reading comprehension. As explained in Chapter 5, the high and low
scores (marked H and L) are used in determining the approximate number of
categories any step interval will yield. Here, only five steps are used in X and
only six in Y. For serious research, both the N of 20 and the numbers of
categories would, of course, be regarded as inadequate.


———

5 Few formulas appear in different statistics texts in as wide a variety of guises as this
one. Sometimes numerator and denominator have been divided by N² so that the first
term in each pair appears divided by N; the second term, by N². Sometimes Σx'/N
appears as c, a correction term. Often, terms involving x' appear as df and terms
involving x'² as d²f, the d representing the deviation in terms of step intervals and the f
representing frequencies. However, all formulas involving summation of cross-products
of coded scores are fundamentally the same.


TABLE 6.2. SCORES OF 20 INDIVIDUALS ON TWO TESTS

              VOCABULARY   LEVEL OF READING                VOCABULARY   LEVEL OF READING
INDIVIDUAL    TEST (Y)     COMPREHENSION (X)  INDIVIDUAL   TEST (Y)     COMPREHENSION (X)
B.A.          46           23                 S.K.         56           24
D.A.          42           27                 G.L.         42           21
P.A.          54           24                 D.L.         50           30
H.B.          44           24                 M.L.         59 (H)       28
R.B.          43           20                 P.L.         34 (L)       18 (L)
A.C.          48           28                 B.S.         48           19
R.G.          48           22                 L.S.         56           30 (H)
A.H.          47           23                 R.S.         43           24
B.J.          37           21                 A.T.         39           23
D.J.          45           18 (L)             F.W.         49           27


The 20 Scores Tallied in a Scatter Diagram

Y: VOCABULARY        X: LEVEL OF READING COMPREHENSION, i_x = 3
TEST, i_y = 5        18-20    21-23    24-26    27-29    30-32
55-59                                  |        |        |
50-54                                  |                 |
45-49                ||       |||               ||
40-44                |        |        ||       |
35-39                         ||
30-34                |

It is readily seen from Table 6.2 that each tally mark represents two scores,
one in Y and the other in X. Thus, the first pair of scores, for B.A., is represented
as one of the three tallies in the cell corresponding to a Y score of 45-49 and
an X score of 21-23. In effect, it is coded at the two midpoints, 47 and 22. As N
and the numbers of categories increase, the grouping error in a scatter diagram
becomes negligible, just as with a frequency distribution.

In making a two-dimensional plot of tallies corresponding to two sets of values,
it is often convenient to use a straightedge on which are indicated the step limits
of X. As this straightedge is positioned in accordance with the Y value, the
correct placing of the tally within the proper cell is facilitated.

Computation of r. Finding a correlation coefficient from a scatter diagram is
demonstrated in Table 6.3. The following notation is used:

$d_x$ = deviation in step-interval units from $M'_x$, the assumed X mean.
$d_y$ = deviation in step-interval units from $M'_y$, the assumed Y mean.
$d_x d_y$ = the product of the X and Y deviations (both in step-interval units).
$f_x$ = frequency of any X step.
$f_y$ = frequency of any Y step.
$f_{xy}$ = frequency within any cell.

TABLE 6.3. COMPUTATION OF $r_{xy}$ FOR 20 INDIVIDUALS ON TWO TESTS
(Together with $M_x$, $M_y$, $s_x$, and $s_y$.)

Each occupied cell shows the cell frequency followed by the product $d_x d_y$ in parentheses.

                              X
Y          18-20    21-23    24-26    27-29    30-32  |  f_y   d_y   d_y f_y   d_y² f_y   Σ d_x d_y f_xy
55-59                        1 (10)   1 (15)   1 (20) |   3     5      15        75           45
50-54                        1 (8)             1 (16) |   2     4       8        32           24
45-49      2 (0)    3 (3)             2 (9)           |   7     3      21        63           27
40-44      1 (0)    1 (2)    2 (4)    1 (6)           |   5     2      10        20           16
35-39               2 (1)                             |   2     1       2         2            2
30-34      1 (0)                                      |   1     0       0         0            0

f_x          4        6        4        4        2    | N = 20  Sums:  56       192          114
                                                                      (Σy')    (Σy'²)       (Σx'y')
d_x          0        1        2        3        4       Sums:
d_x f_x      0        6        8       12        8        34  (Σx')
d_x² f_x     0        6       16       36       32        90  (Σx'²)


By Formula 6.9a,

$$r_{xy} = \frac{N\sum x'y' - \sum x' \sum y'}{\sqrt{N\sum x'^2 - (\sum x')^2}\,\sqrt{N\sum y'^2 - (\sum y')^2}} = \frac{(20 \times 114) - (34 \times 56)}{\sqrt{20 \times 90 - (34)^2}\,\sqrt{20 \times 192 - (56)^2}} = \frac{376}{\sqrt{644}\,\sqrt{704}} = .56$$

By Formula 5.13,

$$M_x = M'_x + \frac{i_x \sum x'}{N} = 19 + \frac{3 \times 34}{20} = 24.1$$

$$M_y = M'_y + \frac{i_y \sum y'}{N} = 32 + \frac{5 \times 56}{20} = 46.0$$

By Formula 5.15a,

$$s_x = \frac{i_x}{N}\sqrt{N\sum x'^2 - (\sum x')^2} = \frac{3}{20}\sqrt{20 \times 90 - (34)^2} = \frac{3}{20}\sqrt{644} = 3.81$$

$$s_y = \frac{i_y}{N}\sqrt{N\sum y'^2 - (\sum y')^2} = \frac{5}{20}\sqrt{20 \times 192 - (56)^2} = \frac{5}{20}\sqrt{704} = 6.63$$

With conventional numbers of categories, say, ten or more, the assumed means
are often taken near the center of each distribution, thus reducing the numerical
size of the constants entering into the formula for r. Here, for convenience and
to avoid all negative values, the assumed means are taken as the midpoints of
the lowest step in each distribution.

Step Frequencies. The $f_x$ and $f_y$ are found by summing the cell frequencies
($f_{xy}$) in columns and in rows. Then $\sum f_x = \sum f_y = N$.

Step frequencies are multiplied by corresponding d's to form $d_x f_x$ and $d_y f_y$
products, which are then summed to find $\sum x'$ and $\sum y'$, respectively. In Table
6.3, $\sum d_x f_x = \sum x' = 34$ and $\sum d_y f_y = \sum y' = 56$.

The $d_x f_x$ are multiplied by the corresponding $d_x$ to form the $d_x^2 f_x$ products
which, when summed, yield $\sum x'^2$ (in this case, 90). Similarly, $\sum y'^2$ is found as
$\sum d_y^2 f_y$, or 192.

$\sum x'y'$ is found as $\sum d_x d_y f_{xy}$. In each cell the product of the two d's is shown in
parentheses. When this $d_x d_y$ is multiplied by the corresponding cell frequency
($f_{xy}$), the sum of the products is the sum of the cross-products of the coded
scores. In Table 6.3, $\sum d_x d_y f_{xy} = \sum x'y' = 114$.

Below the table, Formula 6.9a has been applied to find $r_{xy}$, Formula 5.13 to
find $M_x$ and $M_y$, and Formula 5.15a to find $s_x$ and $s_y$.

Checking Procedures. The bivariate distribution of Table 6.2 may be checked 
by making a completely new diagram or by placing a dot at the end of the 
appropriate tally as each pair of scores is examined. A good way of checking 
the computations is to rework them on a new copy of the scatter diagram, using 
a different M' in both X and Y. 
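
The check by a change of assumed means can be sketched in Python. The cell frequencies below are those obtained by tallying the 20 pairs of Table 6.2 into the steps of the scatter diagram; the dictionary layout, the function names, and the particular shift of the assumed means (two steps in X, three in Y) are assumptions made for this illustration.

```python
from math import sqrt

# Cell frequencies f_xy keyed by (d_x, d_y), with assumed means at the
# lowest steps (d = 0 for X = 18-20 and for Y = 30-34)
cells = {
    (2, 5): 1, (3, 5): 1, (4, 5): 1,
    (2, 4): 1, (4, 4): 1,
    (0, 3): 2, (1, 3): 3, (3, 3): 2,
    (0, 2): 1, (1, 2): 1, (2, 2): 2, (3, 2): 1,
    (1, 1): 2,
    (0, 0): 1,
}


def coded_sums(shift_x=0, shift_y=0):
    """N and the five coded sums after moving the assumed means upward
    by shift_x and shift_y step intervals."""
    n = sx = sy = sxx = syy = sxy = 0
    for (dx, dy), f in cells.items():
        dx, dy = dx - shift_x, dy - shift_y
        n += f
        sx += dx * f
        sy += dy * f
        sxx += dx * dx * f
        syy += dy * dy * f
        sxy += dx * dy * f
    return n, sx, sy, sxx, syy, sxy


def r_coded(n, sx, sy, sxx, syy, sxy):
    """Formula 6.9a."""
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))


original = coded_sums()
shifted = coded_sums(shift_x=2, shift_y=3)
r_original = r_coded(*original)
r_shifted = r_coded(*shifted)

print(original, round(r_original, 2))
print(shifted, round(r_shifted, 2))
```

The five sums change with the assumed means, but the correlation does not, which is what makes the reworking a useful check.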

Comparison of Computations with Raw and Coded Scores. Although N is small 
and steps are few, results by coding (Formulas 5.13, 5.15a, and 6.9a) do not seem 
to differ much from results by raw-score methods (Formulas 5.1, 5.8a, and 6.9), 
as shown in the table below: 


           RESULTS WITH     RESULTS WITH
           CODED SCORES     RAW SCORES

M_x           24.1             23.7
M_y           46.0             46.5
s_x            3.8              3.6
s_y            6.6              6.3
r_xy           .56              .58
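
The comparison can be reproduced from the 20 pairs of Table 6.2. The Python sketch below applies the raw-score formula (6.9) once to the original values and once to the same values coded in step intervals; the integer-division coding and the function name are illustrative choices, not the book's procedure.

```python
from math import sqrt

# (Y vocabulary, X reading comprehension) for the 20 individuals of Table 6.2
pairs = [
    (46, 23), (42, 27), (54, 24), (44, 24), (43, 20),
    (48, 28), (48, 22), (47, 23), (37, 21), (45, 18),
    (56, 24), (42, 21), (50, 30), (59, 28), (34, 18),
    (48, 19), (56, 30), (43, 24), (39, 23), (49, 27),
]
ys = [y for y, _ in pairs]
xs = [x for _, x in pairs]


def pearson_r(a, b):
    """Formula 6.9 applied to two equally long lists of values."""
    n = len(a)
    sa, sb = sum(a), sum(b)
    saa = sum(v * v for v in a)
    sbb = sum(v * v for v in b)
    sab = sum(u * v for u, v in zip(a, b))
    return (n * sab - sa * sb) / (sqrt(n * saa - sa ** 2) * sqrt(n * sbb - sb ** 2))


# Code each score as the number of step intervals above the lowest step
x_coded = [(x - 18) // 3 for x in xs]     # i_x = 3, lowest step 18-20
y_coded = [(y - 30) // 5 for y in ys]     # i_y = 5, lowest step 30-34

r_raw = pearson_r(xs, ys)
r_from_codes = pearson_r(x_coded, y_coded)

print(sum(xs) / 20, sum(ys) / 20)          # raw-score means
print(round(r_raw, 2), round(r_from_codes, 2))
```

The difference between the two coefficients is the grouping error introduced by using only five and six steps.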


A Second Problem. A bivariate distribution based on 582 cases is presented in
Table 6.4 together with computations leading to $r_{xy}$, $M_x$, $M_y$, $s_x$, and $s_y$.
Because the assumed means are chosen near the sample means, a large
proportion of the d's is negative; and the resulting negative values of $d_x f_x$, $d_y f_y$, and
$d_x d_y f_{xy}$ are summed with proper attention to sign. All values of $d_x^2 f_x$ and $d_y^2 f_y$
are, of course, positive.

TABLE 6.4. COMPUTATION OF r FROM A SCATTER DIAGRAM (CONVENTIONAL METHOD)
(N = 582 freshmen entering a university: X = reading comprehension; Y = reading rate)

[The cell entries of Table 6.4 are not recoverable from the source. The sums entering the computations below are N = 582; $\sum x' = 385$; $\sum y' = 185$; $\sum x'^2 = 4441$; $\sum y'^2 = 3139$; $\sum x'y' = 2390$; with $i_x = 10$, $i_y = 4$, $M'_x = 144.5$, and $M'_y = 57.5$.]

By Formula 6.9a,

$$r_{xy} = \frac{N\sum x'y' - \sum x' \sum y'}{\sqrt{N\sum x'^2 - (\sum x')^2}\,\sqrt{N\sum y'^2 - (\sum y')^2}} = \frac{(582 \times 2390) - (385 \times 185)}{\sqrt{582 \times 4441 - (385)^2}\,\sqrt{582 \times 3139 - (185)^2}} = .63$$

By Formula 5.13,

$$M_x = M'_x + \frac{i_x \sum x'}{N} = 144.5 + \frac{10 \times 385}{582} = 151.12$$

$$M_y = M'_y + \frac{i_y \sum y'}{N} = 57.5 + \frac{4 \times 185}{582} = 58.77$$

By Formula 5.15a,

$$s_x = \frac{i_x}{N}\sqrt{N\sum x'^2 - (\sum x')^2} = \frac{10}{582}\sqrt{582 \times 4441 - (385)^2} = 26.82$$

Similarly,

$$s_y = \frac{4}{582}\sqrt{582 \times 3139 - (185)^2} = 9.20$$

Steps in preparing a scatter diagram, such as the one shown in Table 6.4,
and in solving for r, follow:


1. Choose $i_x$ and $i_y$, the two step intervals, so that appropriate numbers of
categories result.

2. Make and check the bivariate distribution.

3. Sum the $f_{xy}$'s in the rows to find the $f_y$'s and in the columns to find the
$f_x$'s. Check: $\sum f_x = \sum f_y = N$.

4. Find $\sum x'$ as the sum of the $f_x$'s multiplied by the $d_x$'s and $\sum y'$ as the
sum of the $f_y$'s multiplied by the $d_y$'s.

5. Find $\sum x'^2$ as the sum of the $d_x f_x$'s multiplied by the $d_x$'s and $\sum y'^2$ as
the sum of the $d_y f_y$'s multiplied by the $d_y$'s. (When a calculating machine is
being used, writing of individual products such as the $d_x f_x$'s should be
avoided. In that case the $f_x$'s can be multiplied by the squares of the $d$'s,
that is, by the $d_x^2$'s, and the products accumulated.)

6. Enter the $d_x d_y$'s in the individual cells, with proper attention to sign, and
multiply by the $f_{xy}$'s. The sum of the products is $\sum x'y'$.

7. All work should be checked before applying formulas to find r and other
constants. In both variables, Charlier's check (as described in Chapter 5) applies.
There are also more or less obvious alternate ways of finding $\sum x'y'$, such as
multiplying the $f_{xy}$'s within rows by the corresponding $d_x$'s and recording the
sums, row by row; and subsequently multiplying each of these summed $d_x f_{xy}$'s
by the $d_y$ of the row. The overall sum will, of course, be $\sum d_x d_y f_{xy}$ or $\sum x'y'$.

Another method is to prepare a new diagram, using new assumed means.
$\sum x'$, $\sum y'$, $\sum x'^2$, $\sum y'^2$, and $\sum x'y'$ will have different numerical values, but if
the arithmetic is correct, the correlation and other constants will be precisely
identical in the two sets of computations.
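The steps above, and the check by a second set of assumed means, can be sketched in Python. The 3 x 3 frequency table below is hypothetical (it is not one of the book's tables), and the function name `r_from_table` is an assumption made for illustration.

```python
from math import sqrt

def r_from_table(freq, origin_row=0, origin_col=0):
    """freq[i][j] is the cell frequency f_xy for Y step i and X step j.
    d_y and d_x are deviations in step intervals from the steps chosen
    as assumed means (origin_row, origin_col)."""
    n = sx = sy = sxx = syy = sxy = 0
    for i, row in enumerate(freq):
        dy = i - origin_row
        for j, f in enumerate(row):
            dx = j - origin_col
            n += f
            sx += f * dx            # contributes to sum of x'
            sy += f * dy            # contributes to sum of y'
            sxx += f * dx * dx      # contributes to sum of x'^2
            syy += f * dy * dy      # contributes to sum of y'^2
            sxy += f * dx * dy      # contributes to sum of x'y'
    r = (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))
    return r, (n, sx, sy, sxx, syy, sxy)

# Hypothetical table, rows ordered from the lowest Y step upward.
table = [[4, 2, 0],
         [2, 5, 2],
         [0, 2, 3]]
r_a, sums_a = r_from_table(table, 0, 0)
r_b, sums_b = r_from_table(table, 1, 2)   # a different pair of assumed means
print(sums_a, sums_b, r_a, r_b)
```

The coded sums differ between the two runs, but the two values of r should agree apart from floating-point rounding.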
EXAMPLE


THE COMPUTATION OF r FROM A SCATTER DIAGRAM
(Cumulative Method)


Purpose. This cumulative method of finding r takes advantage of economies
effected through the use of a calculating machine. It also provides an unusually
convenient system of checks for sums, sums of squares, and the sum of the
cross-products of the coded scores.

Comparison with Conventional Method. Both preparation of the scatter diagram
and final computation of $r_{xy}$, $M_x$, $M_y$, $s_x$, and $s_y$ are exactly the same as in the
conventional method. If identical $M'$ are used, then $\sum x'$, $\sum y'$, $\sum x'^2$, $\sum y'^2$, and
$\sum x'y'$ will also be identical. The chief difference is that much of the
multiplication is accomplished by addition, as in one of the methods of finding
$M_x$ and $s_x$ demonstrated in Example 5.4.

Finding the Cumulative Frequencies. Cumulative frequencies, rather than
frequencies, are found for both rows and columns. All work in either the X or Y
variable starts as far from the origin as possible and is carried toward the origin.
Starting with the first row in which there are tallies, the cell frequencies are
added into the adding machine or calculator and the total of the cell frequencies
in that row is entered in the column headed $Cf_y$. This sum is not cleared from the
machine, but is added to the cell frequencies in the row below to form the
cumulative frequency for that row. If a row has no tallies, the $Cf_y$ is the same as
that of the preceding row. The $Cf_y$ of the bottom row is necessarily N, the number
of cases. The procedure is readily apparent from an inspection of Table 6.5, in
which the successive $Cf_y$ are 3, 5, 12, 17, 19, and 20.

By a process exactly analogous, the cumulative frequencies in X are found.
Tallies or $f_{xy}$'s in the column farthest to the right are added to find the first $Cf_x$.
Without clearing the machine, the tallies or $f_{xy}$'s in the column to the left are
added to find the next $Cf_x$, and so on across to the column containing the X
origin, the $Cf_x$ of which is N. Thus, in Table 6.5, the $Cf_x$'s, beginning at the
right, are 2, 6, 10, 16, and 20.

The Computation of $\sum y'$. To obtain $\sum y'$, the $Cf_y$'s are added, excluding the entry
in the step that contains the assumed mean. This method of computing the sum
of the deviations in terms of step intervals from the arbitrary origin takes
advantage of the principle that the sum of a series of cumulative frequencies is
equal to the sum of the products of each frequency times its deviation from the
origin in terms of step intervals. This fact is easily noted from the algebraic
example below:

    f_y    d_y    d_y f_y    Cf_y

     a      n       na       a
     .      .       .        .
     h      3       3h       a + ... + h
     i      2       2i       a + ... + h + i
     j      1       j        a + ... + h + i + j

$$\sum Cf_y = na + \cdots + 3h + 2i + j$$


6.3 
The frequencies are indicated in the column headed $f_y$ and are a, ..., h, i, j.
The column headed $d_y$ gives the deviations in terms of step intervals from the
arbitrary origin. The column headed $d_y f_y$ gives the products of the deviations in
step intervals as obtained in the ordinary multiplicative method of computing
the mean from an assumed mean. The column headed $Cf_y$ gives the cumulative
frequencies. Since there are n of these cumulative frequencies, in all of which a
is represented, na will be a part of $\sum Cf_y$. It is readily apparent that, irrespective
of the number of terms or the values of the frequencies, the sums of the two
columns $d_y f_y$ and $Cf_y$ are identical. In summing the cumulative frequencies to
obtain $\sum y'$, care must be taken not to include N in the step containing the
origin.
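The principle can be checked numerically. The sketch below uses the row frequencies implied by the cumulative frequencies quoted for Table 6.5 (3, 5, 12, 17, 19, 20); the variable names are illustrative only.

```python
from itertools import accumulate

# Row frequencies f_y, from the step farthest from the origin (d_y = 5)
# down to the step that holds the assumed mean (d_y = 0).
f_y = [3, 2, 7, 5, 2, 1]
d_y = [5, 4, 3, 2, 1, 0]

cf_y = list(accumulate(f_y))                     # cumulative frequencies
sum_cf = sum(cf_y[:-1])                          # leave out N in the origin step
sum_df = sum(d * f for d, f in zip(d_y, f_y))    # ordinary multiplicative method
print(cf_y, sum_cf, sum_df)   # [3, 5, 12, 17, 19, 20] 56 56
```

Both routes give 56, the value of $\sum y'$ reported for Table 6.5.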

Computation of $\sum y'^2$. To compute $\sum y'^2$, the $Cf_y$ are multiplied by the successive
odd numbers, beginning with unity in the step above the one that contains the
assumed mean. The sum of these products is $\sum y'^2$. This method of computing
the sum of the squares of the deviations in terms of step intervals from the
arbitrary origin is an application of the fact that the sum of a series of odd
numbers beginning with unity is equal to $n^2$ when n is the number of terms in
the series. The algebraic basis of this principle is given in the table below:

    f_y    d_y^2    Cf_y                     m         mCf_y

     a      n^2     a                      2n - 1     (2n - 1)a
     .      .       .                        .         .
     h      9       a + ... + h              5         5a + ... + 5h
     i      4       a + ... + h + i          3         3a + ... + 3h + 3i
     j      1       a + ... + h + i + j      1         a + ... + h + i + j

$$\sum mCf_y = n^2 a + \cdots + 9h + 4i + j$$

The successive odd numbers are denoted as m, or the multiplying factors.
It will be seen that $\sum mCf_y = \sum d_y^2 f_y$. Again the cumulative frequency in the step
containing the assumed mean is ignored.

Numerical Computation of $\sum y'$ and $\sum y'^2$. When a key-driven adding machine
is used to compute $\sum y'$ and $\sum y'^2$, two series of operations are performed: the
summing of the $Cf_y$'s and the summing of the products of each $Cf_y$ with its
corresponding m. When a calculating machine is used, the two quantities are found
in one series of operations. Each m is placed in the keyboard and multiplied by
the corresponding $Cf_y$. The accumulation of the multipliers is $\sum Cf_y$, or $\sum y'$, and
the accumulation of products in the product dials is $\sum mCf_y$, or $\sum y'^2$. Here,
$\sum Cf_y$ is 56 and $\sum mCf_y$ is 192. $\sum x'$ and $\sum x'^2$ are computed similarly. $\sum x'$ and
$\sum y'$ are written twice in connection with Charlier's check on the sums of squares.

In Table 6.5, the m's and m''s are written out. If a calculating machine is used,
the operator can put the m's and m''s in the keyboard successively, and they need
not appear on the diagram showing the work. This practice is followed in Table
6.6, as is the omission of the $d_x$'s and $d_y$'s, which would ordinarily appear only
on pieces of cardboard used to facilitate computation.
TABLE 6.5. COMPUTATION OF r BY A CUMULATIVE METHOD
(Data are the same as in Table 6.2)

                              X
                 18-20  21-23  24-26  27-29  30-32
                                d_x
   Y      d_y      0      1      2      3      4     Cf_y    m    m'   CΣd_x f_xy
 55-59     5                     1      1      1       3     9    11       9
 50-54     4                     1             1       5     7     9      15
 45-49     3       2      3             2             12     5     7      24
 40-44     2       1      1      2      1             17     3     5      32
 35-39     1              2                           19     1     3      34
 30-34     0       1                                  20           1      34
                                                     (N)                 (Σx')
 Cf_x             20     16     10      6      2
                 (N)
 m                        1      3      5      7
 m'                1      3      5      7      9
 CΣd_y f_xy       56     48     35     22      9
                (Σy')

$$\sum C\Sigma d_x f_{xy} = \sum C\Sigma d_y f_{xy} = \sum x'y' = 114 \;(\checkmark)$$

$$r = \frac{N\sum x'y' - \sum x'\sum y'}{\sqrt{N\sum x'^2 - (\sum x')^2}\,\sqrt{N\sum y'^2 - (\sum y')^2}}
    = \frac{(20 \times 114) - (34 \times 56)}{\sqrt{20 \times 90 - (34)^2}\,\sqrt{20 \times 192 - (56)^2}}$$

    N = 20                                 N = 20
    ΣCf_x = Σx' = 34 (✓)                   ΣCf_y = Σy' = 56 (✓)
    Σx' = 34                               Σy' = 56
    ΣmCf_x = Σx'^2 = 90 (✓)                ΣmCf_y = Σy'^2 = 192 (✓)
    Σm'Cf_x = 178                          Σm'Cf_y = 324


TABLE 6.6. COMPUTATION OF r FROM A SCATTER DIAGRAM (CUMULATIVE METHOD)
(Data are the same as in Table 6.4)

[Table body not recoverable from the source.]
Charlier's Check. As already noted in connection with Example 5.3, the basic
formula for Charlier's check is $\sum (y' + 1)^2 = \sum y'^2 + 2\sum y' + N$. That is to say, if
we drop the assumed mean just one step interval, and thereby increase the value
of all deviations by 1, the new sum of squares will be equal to the old sum of
squares plus twice the sum of the deviations from the old origin plus N. The
successive odd numbers denoted by m' are designed for use with Charlier's
check. The procedure used in determining $\sum Cf_y$ and $\sum mCf_y$ is repeated, and the
sum of the products of the m''s and the $Cf_y$'s is entered as $\sum m'Cf_y$. When it
appears that this value is equal to the sum of the four entries above it, a check
is obtained on the computation of $\sum y'^2$. In Table 6.5, 20 + 56 + 56 + 192
= 324.
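A short Python sketch of the odd-number multipliers and of Charlier's check follows, using the row frequencies implied by the cumulative frequencies quoted for Table 6.5. The variable names are illustrative only.

```python
from itertools import accumulate

f_y = [3, 2, 7, 5, 2, 1]             # d_y = 5, 4, 3, 2, 1, 0
cf_y = list(accumulate(f_y))         # 3, 5, 12, 17, 19, 20
n = cf_y[-1]
steps = len(cf_y)

# m' runs 11, 9, ..., 1 and includes the step holding the assumed mean.
m_prime = [2 * (steps - i) - 1 for i in range(steps)]
sum_mprime_cf = sum(mp * c for mp, c in zip(m_prime, cf_y))

# m runs 9, 7, ..., 1 and stops one step above the assumed mean.
m = [mp - 2 for mp in m_prime[:-1]]
sum_y = sum(cf_y[:-1])                                  # sum of y'
sum_y2 = sum(mi * c for mi, c in zip(m, cf_y[:-1]))     # sum of y'^2

check = n + sum_y + sum_y + sum_y2
print(sum_y, sum_y2, sum_mprime_cf, check)   # 56 192 324 324
```

The check succeeds because each m' exceeds the corresponding m by 2, so $\sum m'Cf_y$ picks up $2\sum y'$ plus the N entered in the origin step.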

Computation of $\sum x'y'$ (working within the columns). $\sum x'y'$ is computed twice.
Working within columns, each cell frequency (designated as $f_{xy}$) is multiplied
by the corresponding $d_y$ and the sum of these products for the column at the
right is entered in the row labeled $C\Sigma d_y f_{xy}$. Without clearing this sum from the
machine, the cell frequencies in the next column nearer the origin are similarly
multiplied by the $d_y$'s, and the cumulative sum written in the appropriate row.
The work is carried through the column that includes the X origin. The last
entry is $\sum y'$, but this entry is used only as a check upon the previous
computation of $\sum y'$. The sum of all the entries in this bottom row of the diagram,
excluding the entry in the column containing the X origin, is $\sum C\Sigma d_y f_{xy}$, or
$\sum x'y'$. In Table 6.5, the successive entries are (5 × 1) + (4 × 1) = 9;
9 + (5 × 1) + (3 × 2) + (2 × 1) = 22; 22 + (5 × 1) + (4 × 1) + (2 × 2) = 35;
35 + (3 × 3) + (2 × 1) + (1 × 2) = 48; 48 + (3 × 2) + (2 × 1) = 56. $\sum y'$ is 56
both in this operation and in the operation involving the $Cf_y$'s.
$\sum C\Sigma d_y f_{xy} = \sum x'y' = 48 + 35 + 22 + 9 = 114$. It is to be noted that the entries
in the row containing the Y origin are not used in computing the cross-products.
Computation of $\sum x'y'$ (working within the rows). The process of computing
$\sum x'y'$ from the rows is exactly analogous to the process of computing $\sum x'y'$ from
the columns. Beginning with the row farthest from the origin, work proceeds
toward the origin in Y. Each cell frequency ($f_{xy}$) is multiplied by its
corresponding $d_x$, and the product is entered in the extreme right-hand column
headed by $C\Sigma d_x f_{xy}$. Without clearing the first entry from the machine, the entry
for the next row is computed and entered in the allotted space. This continues
through the row containing the origin in Y, in which case the entry is $\sum x'$ and
becomes a check on the previous computation of this figure. The sum of the
entries in the column, excluding the final entry, is $\sum C\Sigma d_x f_{xy}$, or $\sum x'y'$. In
Table 6.5, (4 × 1) + (3 × 1) + (2 × 1) = 9; 9 + (4 × 1) + (2 × 1) = 15;
15 + (3 × 2) + (1 × 3) = 24; 24 + (3 × 1) + (2 × 2) + (1 × 1) = 32;
32 + (1 × 2) = 34; and this entry is repeated for the final row. In computing
$\sum x'y'$ from either the rows or the columns, it is convenient to use a strip of
cardboard with the d's written on it.
Algebraic Explanation of the Computing Principle in Finding $\sum x'y'$. Consider
a tally in a cell with a $d_y$ value of a and a $d_x$ value of b. When working in the
columns, this tally takes on a value of a, and since cumulative sums are used, it
appears in the $C\Sigma d_y f_{xy}$ row b times; hence it adds ab to the value of $\sum x'y'$. In
a similar fashion the tally contributes ab to the $C\Sigma d_x f_{xy}$ column.
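The column and row routines can be compared with the direct sum of $d_x d_y f_{xy}$ in a short Python sketch. The cell frequencies below are those implied by the worked products quoted for Table 6.5; the variable names are illustrative only.

```python
from itertools import accumulate

# Rows run from d_y = 5 down to d_y = 0; columns from d_x = 0 to d_x = 4.
freq = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1],
    [2, 3, 0, 2, 0],
    [1, 1, 2, 1, 0],
    [0, 2, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
d_y = [5, 4, 3, 2, 1, 0]
d_x = [0, 1, 2, 3, 4]

# Columns: sum d_y * f within each column, cumulated from the right.
col_sums = [sum(dy * row[j] for dy, row in zip(d_y, freq))
            for j in range(len(d_x))]
c_cols = list(accumulate(reversed(col_sums)))    # 9, 22, 35, 48, 56
sum_xy_cols = sum(c_cols[:-1])                   # omit the X-origin column

# Rows: sum d_x * f within each row, cumulated from the top.
row_sums = [sum(dx * f for dx, f in zip(d_x, row)) for row in freq]
c_rows = list(accumulate(row_sums))              # 9, 15, 24, 32, 34, 34
sum_xy_rows = sum(c_rows[:-1])                   # omit the Y-origin row

# Direct definition for comparison.
direct = sum(dy * dx * f
             for dy, row in zip(d_y, freq)
             for dx, f in zip(d_x, row))
print(sum_xy_cols, sum_xy_rows, direct)   # 114 114 114
```

The last cumulative entries, 56 and 34, reproduce $\sum y'$ and $\sum x'$ and so serve as the checks described above.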
Comparison with Table 6.3. $\sum x'$, $\sum y'$, $\sum x'^2$, $\sum y'^2$, and $\sum x'y'$ are exactly the
same in Table 6.5 and Table 6.3. This is to be expected, since the data are
identical and identical assumed means are used.

Recapitulation. In the procedure described above, the suggested routine is as 
follows: 

1. Sum all the cell frequencies in the rows, obtaining the cumulative frequencies
in Y, the last entry being N.

2. Sum the frequencies in the columns, obtaining the cumulative frequencies
in X, the last entry again being N.

3. Obtain $\sum y'$ by summing the cumulative frequencies down to, but not
including, the step containing the assumed mean in Y, and $\sum y'^2$ by multiplying
the m's by the corresponding $Cf_y$'s. Check the results through the use of the
Charlier formula. Repeat the routine to obtain $\sum x'$ and to obtain and check $\sum x'^2$.

4. Compute $\sum x'y'$ by multiplying the cell frequencies in each column by the
corresponding $d_y$, cumulating all results toward the origin. The last entry in the
row labeled $C\Sigma d_y f_{xy}$ is $\sum y'$. In the rows, in which each cell frequency is
multiplied by its corresponding $d_x$ and the resulting sums are cumulated toward
the Y origin, the $C\Sigma d_x f_{xy}$ are found. In this way, $\sum x'$ and $\sum x'y'$ are checked.

5. The coefficient of correlation is obtained by Formula 6.9a, the means by
Formula 5.13, and the standard deviations by Formula 5.15a.

A Second Example. In Table 6.6, the data of Table 6.4 are reworked by 
this cumulative method. Since different arbitrary origins are used, the sums, sums 
of squares, and sums of cross-products of coded scores are also different; but 
final results are, of course, identical. 


THE DIFFERENCE FORMULA AND THE SUM FORMULA FOR r


A coefficient of correlation may be computed from three variances (or 
standard deviations): the variances of the two variables being correlated, 
and either the variance of the differences between pairs of scores, or the 
variance of the sums of pairs of scores. 

Consider two variables, $x_i$ and $x_j$, measured as deviations from their
respective means. Let $V_i$ and $V_j$ be their variances and $C_{ij}$ their
covariance,⁶ all in original score units. Then,

$$(x_i - x_j) = x_i - x_j$$

Squaring, summing, and dividing by N,

$$\frac{\sum (x_i - x_j)^2}{N} = \frac{\sum x_i^2}{N} - \frac{2\sum x_i x_j}{N} + \frac{\sum x_j^2}{N}$$

Writing equivalents yields

$$V_{(i-j)} = V_i - 2C_{ij} + V_j$$

in which $V_{(i-j)}$ is the variance of the difference between $X_i$ and $X_j$ values.
Since adding or subtracting a constant has no effect on the variance of a
variable, the variance of raw-score differences is exactly the same as the
variance of the differences in deviation scores.

Solving for $C_{ij}$, we have

$$C_{ij} = \frac{V_i + V_j - V_{(i-j)}}{2}$$

and dividing by $s_i s_j$ gives

$$r_{ij} = \frac{C_{ij}}{s_i s_j} = \frac{V_i + V_j - V_{(i-j)}}{2 s_i s_j} = \frac{s_i^2 + s_j^2 - s_{(i-j)}^2}{2 s_i s_j} \qquad (6.10)$$

An exactly parallel development, starting with $(x_i + x_j)$, yields the sum
formula:

$$r_{ij} = \frac{V_{(i+j)} - V_i - V_j}{2 s_i s_j} = \frac{s_{(i+j)}^2 - s_i^2 - s_j^2}{2 s_i s_j} \qquad (6.11)$$

6 In mathematical statistics "Cov" is often used as the symbol for covariance. Here, C is
used. While it is the same symbol that is used for the contingency coefficient, the two
concepts should not be confused.

Formulas 6.10 and 6.11, which are applied in Example 6.1, also appear
in various guises: in terms of deviations in step intervals from assumed
means and in terms of raw scores. Many published correlation charts are
based upon Formula 6.10, which requires no cross-products and a relatively
small number of steps for the distribution of the differences. Incidentally,
Formula 6.11 may be readily solved for $V_{(i+j)}$, giving the variance of a new
variable found by adding $X_i$ and $X_j$ in pairs:

$$V_{(i+j)} = V_i + V_j + 2r_{ij}s_i s_j = V_i + V_j + 2C_{ij} \qquad (6.12)$$
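Formulas 6.10, 6.11, and 6.12 can be verified against the direct product-moment computation. The paired scores in this Python sketch are hypothetical, and the helper name `variance` is an assumption; variances are taken with N in the denominator, as in the derivation above.

```python
from math import sqrt

def variance(values):
    """Variance with N (not N - 1) in the denominator."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical paired scores, for illustration only.
xi = [2, 4, 4, 5, 7, 8]
xj = [1, 3, 2, 6, 5, 7]

v_i, v_j = variance(xi), variance(xj)
v_diff = variance([a - b for a, b in zip(xi, xj)])
v_sum = variance([a + b for a, b in zip(xi, xj)])
two_ss = 2 * sqrt(v_i) * sqrt(v_j)

r_diff = (v_i + v_j - v_diff) / two_ss     # Formula 6.10
r_sum = (v_sum - v_i - v_j) / two_ss       # Formula 6.11

# Direct product-moment r for comparison.
mi, mj = sum(xi) / len(xi), sum(xj) / len(xj)
cov = sum((a - mi) * (b - mj) for a, b in zip(xi, xj)) / len(xi)
r_direct = cov / (sqrt(v_i) * sqrt(v_j))
print(r_diff, r_sum, r_direct)
```

All three values should agree apart from floating-point rounding, and `v_sum` should equal `v_i + v_j + 2 * cov`, as Formula 6.12 states.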


INTERPRETATION OF r 


ROLE OF N, THE NUMBER OF CASES 


As stated earlier, it is impossible to find a correlation for a single pair of
observations. In that case, N = 1, and no variable has been established.
With two pairs of observations, r is either 1.00 or -1.00. With three pairs
of observations, r can take four different values, and as N increases
without limit, so does the number of possible different values of r. However,
the restriction remains that r cannot be greater than 1.00 nor less than
-1.00.

One exceedingly important characteristic of a correlation is its
reliability, that is, the degree to which it can be expected to be stable if
computed from subsequent samples of the same size drawn at random
from the same unlimited population. As developed in Chapter 13,
reliability estimates are greatly affected by sample size. Accordingly, while
the size of r is independent of N, the number of cases is of crucial
importance in determining the significance of r.
THE ROLE OF UNITS OF MEASUREMENT
A correlation coefficient is a pure number, completely independent of the 
units used to measure either variable. 

Any variable may be transformed linearly by adding or subtracting a 
constant, or by multiplying or dividing by a constant, or by both addition 
(or subtraction) and multiplication (or division), without in any way affect- 
ing the correlations of the variable. 

Let X' be any linear function of X, so that

$$X' = bX + c \qquad (6.13)$$

in which b and c are constants.

Summing,

$$\sum X' = b\sum X + Nc$$

Dividing by N and writing equivalents,

$$M_{x'} = bM_x + c \qquad (6.14)$$

Subtracting Eq. 6.14 from Eq. 6.13 and writing the deviations, x' for
$(X' - M_{x'})$ and x for $(X - M_x)$, yields

$$x' = bx \qquad (6.15)$$

If we square both sides of Eq. 6.15, then sum and divide by N and
extract square roots, it is seen that

$$s_{x'} = bs_x \qquad (6.16)$$

This shows that multiplication (or division) of all scores by a constant
will multiply (or divide) the standard deviation by that constant. However,
addition (or subtraction) of a constant has no effect.

We can now compare the correlation of x and x' with y, defined as any
third variable.

Multiplying both sides of Eq. 6.15 by y, then summing, and dividing
by N, we have

$$\frac{\sum x'y}{N} = b\,\frac{\sum xy}{N} \qquad (6.17)$$

By dividing both sides of Eq. 6.17 by $s_{x'}s_y$ so as to convert the
covariance on the left to a correlation coefficient, we have

$$\frac{\sum x'y}{N s_{x'} s_y} = r_{x'y} = b\,\frac{\sum xy}{N s_{x'} s_y}$$

Substituting $bs_x$ for $s_{x'}$ in accordance with Eq. 6.16, rearranging, and
writing equivalents,

$$r_{x'y} = \frac{b\sum xy}{N b s_x s_y} = \frac{\sum xy}{N s_x s_y} = r_{xy}$$

which proves that a linear conversion has no effect on the correlations of a
variable. (Eq. 6.16 takes b as positive; a negative b leaves the size of each
correlation unchanged but reverses its sign.)

By multiplying both sides of Eq. 6.15 by x and by proceeding by a
similar development, it can be shown that X and X' correlate perfectly.
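The invariance of r under a linear conversion is easy to confirm numerically. In this Python sketch the scores and the constants b = 10 and c = 50 are hypothetical, and the helper name `pearson_r` is an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation from deviation scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores, for illustration only.
x = [1.0, 2.0, 4.0, 5.0, 7.0, 6.0]
y = [3.0, 5.0, 4.0, 8.0, 9.0, 7.0]

b, c = 10.0, 50.0                      # any positive b and any c
x_prime = [b * v + c for v in x]       # X' = bX + c  (Eq. 6.13)

r_xy = pearson_r(x, y)
r_xpy = pearson_r(x_prime, y)
r_xxp = pearson_r(x, x_prime)
print(r_xy, r_xpy, r_xxp)
```

The first two values should coincide, and the third should be 1.00 apart from rounding.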


NORMALITY OF DISTRIBUTION AND CORRELATION 


In mathematical treatments of correlation, it is sometimes assumed that
the two variables are normally distributed and that the cases in the
arrays also follow the normal⁷ distribution. In Fig. 6.2, the $X_1$ distribution
is perfectly "rectilinear"; that is, all steps have identical frequencies, and
the overall shape of the distribution is flat. The $X_0$ distribution is
symmetrical, with concentration of frequencies in the center, but (as compared
with the normal distribution described in Chapter 11) is not perfectly normal.
In Fig. 6.2 the frequencies within each $X_1$ array are distributed in
rectilinear fashion. While the distribution demonstrates homoscedasticity
in the $X_1$ arrays (and hence is homoscedastic with reference to $X_0$ as a
dependent variable), it does not show normality of distribution around the
regression line.

In the derivation of the formula for r, no assumption was made about the
shapes of the two distributions or about variation in variance along the
regression line. It is only in applying the standard error of estimate in
specific instances of prediction that the question arises as to whether the
variation around the regression line is homoscedastic or heteroscedastic.
In interpreting any single score by means of the standard error of estimate
(as inferred from the standard deviation of the residuals), the distribution
is assumed to be homoscedastic, since the standard error of estimate is
conceived as uniform throughout the range.

7 The "normal" distribution is described in Chapter 11. It is symmetrical, with the
higher concentrations of frequencies close to the mean.


r AS A MEASURE OF RELATIONSHIP

It has been seen that r can be defined as the slope of the straight line of
best fit connecting two variables after their variances (and thus their
standard deviations) have been equalized. Indirectly, r measures the
degree of relationship between the two variables. As the variance⁸ of the
errors of prediction (that is, the variance of the residuals around the
regression line) decreases, r increases; as the variance of the errors of
prediction increases, r decreases. These facts are summarized in the
following formula (based on Formula 6.6), in which $V_{0.1}$ is the residual
variance in z-form around the regression line:

$$r_{01} = \pm\sqrt{1 - V_{0.1}} \qquad (6.18)$$

If there is no error of prediction, $V_{0.1}$ will be zero and r will be +1.00
or -1.00. If the variance of the errors of prediction equals the criterion
variance, $V_{0.1}$ will be 1.00 and $r_{01}$ will be zero. With intermediate degrees
of error of prediction, r will have intermediate absolute values. Whether
r is positive or negative is determined by the direction of slope of the
regression line. The degree of association is independent of the sign of the
correlation coefficient. One can predict just as efficiently from a negative
r as from a positive r of the same absolute magnitude.

8 As discussed earlier, the square root of this variance is taken as the "standard error
of estimate."
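Formula 6.18 can be illustrated with a small Python sketch. The raw scores are hypothetical, the helper name `z_scores` is an assumption, and standard deviations are taken with N in the denominator so that each set of z scores has unit variance.

```python
from math import sqrt

def z_scores(values):
    """Convert raw scores to z scores (standard deviation uses N)."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / s for v in values]

# Hypothetical raw scores for the criterion (0) and the predictor (1).
x0 = [3, 5, 4, 8, 9, 7]
x1 = [1, 2, 4, 5, 7, 6]

z0, z1 = z_scores(x0), z_scores(x1)
n = len(z0)
r01 = sum(a * b for a, b in zip(z0, z1)) / n         # mean z-score product
residuals = [a - r01 * b for a, b in zip(z0, z1)]    # errors of prediction
v_resid = sum(e ** 2 for e in residuals) / n         # residual mean is zero
r_from_resid = sqrt(1 - v_resid)                     # Formula 6.18, sign aside
print(r01, v_resid, r_from_resid)
```

The absolute value of `r01` and `r_from_resid` should agree apart from rounding; the sign has to be supplied from the slope of the regression line.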


WHAT AFFECTS r 


The correlation is changed systematically whenever the effective range in 
which it is computed is changed. If, to a certain group of cases, more cases 
are added which are either high or low in one of the variables being 
correlated, the correlation between the two variables will increase. On the 
other hand, if the upper or lower part of the group is dropped out, the 
correlation will decrease.⁹ "Change in range" refers only to changes in
the composition of the sample in which the correlation is computed. 
Such changes have a marked influence on values of the correlation co- 
efficient, as contrasted with systematic changes in all the numerical values 
of one of the variables, or of both. Changes through linear conversion 
have, of course, no influence on the correlation coefficient. 

The change in r with the group in which it is computed means that, in 
reporting correlation coefficients, the groups used should always be
carefully defined. 
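The effect of curtailing the range can be seen in a deterministic Python sketch. The data are artificial (a straight-line trend plus an error of fixed size that alternates in sign), and the helper name `pearson_r` is an assumption; the example shows the direction of the effect only, not its size in real data.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation from deviation scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Artificial data: y follows x with an error of +3 or -3.
x = list(range(20))
y = [v + (3 if v % 2 == 0 else -3) for v in x]

r_full = pearson_r(x, y)               # whole group
r_low = pearson_r(x[:10], y[:10])      # lower half of the X range only
print(round(r_full, 3), round(r_low, 3))
```

The errors of prediction are the same size in both groups, but in the curtailed group they are large relative to the reduced spread of X, so `r_low` comes out smaller than `r_full`.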

Correlation is also affected by changes in the internal composition of the 
variables concerned. 

If a variable, such as a psychological test, is modified by dropping out or
adding unreliable items, or by shortening or lengthening the test as a 
whole, or by changes in the conditions under which it is administered, its 
correlations tend to be altered. Any change that increases the internal 
consistency or reliability of a variable tends to raise its correlations with 
outside variables; and any change that decreases its reliability, lowers its 
correlations. 


9 Further discussion of this point, together with methods of estimating the changes in r, 
is presented in Chapter 10. 
SUMMARY


Product-moment correlation is a technique useful for describing the 
relationship between two variables measured on scales that have values 
which can be added and multiplied. The joint function of two such variables 
can be described in terms of: (1) a covariance; (2) a correlation coefficient; 
or (3) a simple regression equation. 

As the mean of paired deviation products, the covariance is useful 
chiefly as an intermediate step in finding other statistics. For example, 
when divided by a variance (as described in Chapter 7), a covariance 
becomes a regression coefficient, useful in regression equations; when 
divided by two standard deviations, it becomes a correlation coefficient, 
a pure number that is independent of the original units of measurement 
and which can be interpreted as an indication of the direction and degree 
of relationship between the two variables. 

Although correlations vary from .00 to 1.00 (and negatively from .00
to -1.00), they cannot be considered as proportions (or percentages).
Basically, an r is merely the slope of the best-fitting, least squares line,
after the variances of the two variables have been equalized. Somewhat
indirectly, r becomes a measure of relationship by indicating (when
squared) the proportion of the variance in one variable predictable through
knowledge of the values in the other.

The equation of the line of best fit is a means by which unknown values
in one variable can be estimated from known values in the other. The
regression equation, either in z form or in terms of original values, is a
joint function of two variables of great practical importance in making
predictions. The problem is treated more extensively in Chapter 10.

Another important application of correlation is in describing the
reliability of a single variable; that is, the degree to which the device used
to measure the variable yields consistent results, either from alternate
forms or from repeated measurements. This application is amplified in
Chapter 15.

The effects of "changes in range" on r are discussed in more detail in
Chapter 10, and the effects of changes in reliability are taken up again
in Chapter 15.


EXERCISES 


1. Develop a formula for the correlation between two variables in the form of
scores in which each mean is precisely 50 and each standard deviation is 10.
(This may be accomplished by appropriate substitution of known values in
one of the raw-score formulas for r.)
2. The following z scores for two variables should be first plotted on graph
paper as a scatter diagram.

    OBSERVATION    VARIABLE X    VARIABLE Y

         A            1.50           .50
         B            1.00          1.00
         C             .50          1.50
         D             .00          -.50
         E            -.50         -1.50
         F           -1.00           .00
         G           -1.50         -1.00

(a) Fit a straight line, by eye, connecting the two variables.

(b) Measure the vertical distance between each actual $z_y$ value and the fitted
line. (These distances may be called the crude errors of prediction.)

(c) Find the sum of the squares of these crude errors.

(d) Compute the correlation coefficient between X and Y as the mean z score
product.

(e) Using a distinctive color, plot the best-fitting straight line by the principle
of least squares. (This line passes through the origin and has a slope equal
to $r_{xy}$.)

(f) Measure the vertical distance between each actual $z_y$ value and the new
line. (These distances are the true errors of prediction, or residuals.)

(g) Find the sum of the squares of the true errors and compare it with the
sum of the squares of the crude errors.


3. Consider the following z scores for two variables: 


    OBSERVATION    VARIABLE 0    VARIABLE 1

         A            1.50           .50
         B             .50          -.50
         C             .00          1.50
         D            -.50           .00
         E           -1.50         -1.50

(a) Compute $r_{01}$.

(b) Compute the five values of $\tilde{z}_0$ as $r_{01}z_1$.

(c) Compute the five values of the residual $z_{0.1}$ as $z_0 - \tilde{z}_0$.

(d) Find the covariance between $z_1$ and $z_{0.1}$.

(e) Why is this covariance .00?

(f) Compute the variance of $z_{0.1}$ and determine whether it has the value of
$(1 - r_{01}^2)$.
4. The following are the scores of 20 candidates for a police force on two tests:
a police aptitude test and a space relations test:

                   POLICE       SPACE
    CANDIDATE     APTITUDE    RELATIONS

        A            89           66
        B            80           70
        C            80           49
        D            97           67
        E            96           81
        F            98           79
        G            91           39
        H            78           59
        I            74           45
        J            83           31
        K            99           81
        L            89           28
        M            64           19
        N           110           66
        O           106           77
        P            64           10
        Q            92           85
        R            84           54
        S            83           41
        T            84           35

(a) Compute r by the raw-score product-moment formula (Formula 6.9).
(b) Compute r by the difference formula (Formula 6.10).
(c) Compute r by the sum formula (Formula 6.11).
(d) Observe whether or not the following check equations hold:

$$\sum (X + Y)^2 = \sum X^2 + 2\sum XY + \sum Y^2$$

$$\sum (X - Y)^2 = \sum X^2 - 2\sum XY + \sum Y^2$$

5. Sir Francis Galton (1) reports data on the heights of 205 women and their
husbands, using three categories: short, medium, and tall. (These data were
reported about the time Galton was working out the theory of correlation.
Today, in correlating continuous variables, one would use a number of steps
appreciably greater than three.)

MARRIAGE SELECTION IN RESPECT TO STATURE

                       HUSBANDS
    WIVES      SHORT    MEDIUM    TALL

    Tall         12       20       18
    Medium       25       51       28
    Short         9       28       14

Compute the product-moment r and decide whether Galton's conclusion, that
there was no relationship between the two variables, is reasonable for these
data.
(a) By inspection, what can be said about linearity of regression?

(b) Is the bivariate distribution homoscedastic either vertically or horizontally?

(c) Compute r for the total group.

(d) Compute r for the subgroup with Beta scores below 90.

(e) Compute r for the subgroup with Beta scores of 90 or more.

(f) Why are the r's for the truncated groups ("restricted in range") less than
the r in the uncurtailed range?
THE NATURE OF MULTIPLE R


A major aim of science is such precise description of phenomena and their 
relationships that accurate forecasts of future findings or happenings be- 
come possible. In astronomy, for example, eclipses are predicted with a 
high degree of accuracy. Similarly, in chemistry it is often possible to state
properties of a compound before the substance is actually in existence.
Psychology aims to understand human behavior. While it seems extremely 
unlikely that human behavior will ever be completely predictable, statistical 
techniques may be advantageously used in forecasting the behavior of 
both individuals and groups. 

The simple regression equation involving two variables, described in 
Chapter 6, can be used for prediction, but more typically in applied psy- 
chology, a criterion such as school achievement or success on a job is fore- 
cast by means of a number of measurements, typical of which are test 
scores and biographical data. The problem is to combine information from 
several sources in such a fashion that the errors in prediction will be as 
small as possible. The statistical device by which a number of predictors are 
combined to yield a single score having the highest possible correlation 
with a criterion is the multiple regression equation, summarized by the 
coefficient of multiple correlation. 
Like all other types of correlation, a multiple correlation shows the
relationship between two and only two variables. It is a product-moment
correlation which (with considerable unnecessary effort) could actually be
computed by means of a basic formula for r. It is the correlation between
an unmodified variable (the “dependent variable” or “criterion”) and a
second variable consisting of the weighted sum of scores in two or more
“independent” or “predictor” variables (the weights of which are such
that the correlation of the sum variable with the unmodified criterion is at
a maximum for the particular sample of observations used in computing
it).

Like all r's, a coefficient of multiple correlation is the slope of the least
squares line of best fit connecting two variables of equal variability.
However, since computing methods involve the extraction of a square root
conventionally taken as positive, the multiple is regarded as varying
between .00 and 1.00 rather than from -1.00 to 1.00.

The criterion may be denoted as variable 0, and the predictors used in
determining the weighted sum as variables 1, 2, 3 ... n. The objective in
multiple correlation, then, is to determine how to weight variables
1, 2, 3 ... n in such a way that the correlation of their sum with variable 0
will be as high as possible. In any sample of N cases, and using information
as to the degree of rectilinear relationship between pairs of variables, there
is a unique solution to this problem.


R AND THE MULTIPLE REGRESSION COEFFICIENTS
The multiple correlation may be indicated as $R_{0(12 \ldots n)}$, in which the capital
letter indicates a coefficient involving differential weighting of the components
of a sum variable so as to maximize the correlation. In this expression,
the criterion variable is indicated as a subscript outside the parentheses,
while the predictors are shown within the parentheses. The
expression $(12 \ldots n)$ refers to n predictors, commas being omitted between
subscripts unless needed for clarity. The order of the subscripts within the
parentheses is immaterial. Thus, if variables 1 and 2 were the only predictors,
the multiple could be indicated either as $R_{0(12)}$ or as $R_{0(21)}$.

Regression coefficients or regression weights applied to predictor variables
are known as beta coefficients (denoted with the Greek letter $\beta$) if the
variables have unit variance, as is the case with z scores. They are known as
b coefficients or b weights if the variables have original variances, as with
deviation scores or raw scores. In this chapter only the betas are of interest,
since b weights are used chiefly in practical prediction problems, as discussed
in Chapter 10.

If no confusion is likely to result, a regression coefficient may have a
single subscript indicating only the variable to which it is applied. Thus a
multiple regression equation with four predictors may be written as

$$\tilde{z}_0 = \beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \beta_4 z_4 \tag{7.1}$$

in which the tilde over $z_0$ indicates a predicted rather than an obtained
value.

Formula 7.1 represents a scheme for finding the most likely criterion
value for a particular case from the four z scores for the four predictors,
together with four constant weights, the $\beta$ coefficients. When, for a sample
of cases, the ten intercorrelations of the five variables are known, it is
possible to determine the numerical values of $\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$. In the
sample, all $z_0$ values are, of course, known. The regression equation can
be used for predicting $\tilde{z}_0$ for new cases for which $z_0$ is unknown but for
which values are available for the four predictors.
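How the betas follow from the ten intercorrelations can be sketched in a few lines. The sketch below is an illustration only, not the routine developed in this chapter: it assumes that the betas satisfy the usual simultaneous linear equations (predictor intercorrelations times betas equal the validities) and solves them by ordinary Gauss-Jordan elimination. The function name `solve_betas` is hypothetical, and the correlations are those of Table 7.2 of this chapter, listed here in the order 1, 2, 3, 4.

```python
def solve_betas(pred_corr, validities):
    """Solve pred_corr * beta = validities by Gauss-Jordan elimination."""
    n = len(validities)
    # Augmented matrix [pred_corr | validities]; inputs are copied, not modified.
    aug = [[float(x) for x in pred_corr[i]] + [float(validities[i])] for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Intercorrelations among predictors 1, 2, 3, 4 (values from Table 7.2).
pred_corr = [
    [1.00, 0.44, 0.08, 0.25],
    [0.44, 1.00, 0.35, 0.37],
    [0.08, 0.35, 1.00, 0.39],
    [0.25, 0.37, 0.39, 1.00],
]
# Correlations of predictors 1, 2, 3, 4 with the criterion, variable 0.
validities = [0.39, 0.55, 0.35, 0.30]

betas = solve_betas(pred_corr, validities)
print([round(b, 4) for b in betas])
```

The four values agree, within rounding, with the final-order betas reported in Table 7.2 (.1967, .3853, .1854, and .0360).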

For greater precision, beta coefficients can be written with subscripts 
indicating all the variables involved in the multiple. Of the two “primary” 
subscripts (written before the period) the first designates the criterion, and 
the second designates the variable to which the beta is applied when a pre- 
dicted value is computed. The “secondary” subscripts, written after the 
period, indicate all other variables used in finding the beta. Thus, $\beta_{02.1345}$
is the beta to be applied to variable 2 in predicting variable 0 as a criterion
when the predictors are variables 1, 2, 3, 4, and 5. Accordingly, the precise
method of writing the betas for a problem with four predictors is

$$\tilde{z}_0 = \beta_{01.234} z_1 + \beta_{02.134} z_2 + \beta_{03.124} z_3 + \beta_{04.123} z_4 \tag{7.1a}$$


The “order” of the beta is the number of secondary variables. Thus,
$\beta_{01.234}$ is a third-order beta. If there are no secondary variables, as in
predicting values in a criterion from a single predictor, the beta is of “zero
order.” The n betas used in a regression equation based on n predictors
are all of the (n - 1)st order.

The multiple R is precisely the product-moment correlation between
actual criterion values (in z form, denoted as $z_0$) and the predicted values
(denoted as $\tilde{z}_0$). However, if all the intercorrelations among the predictors
and the correlations of the predictors with the criterion are known, this 
correlation can be inferred exactly without using the individual values of 
the criterion and without computing the individual predicted values. In 
fact, the multiple correlation can be determined before the final beta co- 
efficients are available, and hence before it is possible to find the predicted 
values. To understand the way in which this is done, it is convenient to 
employ the concept of a “residual” variable. 


PROPERTIES OF A RESIDUAL VARIABLE 


In Chapter 6, certain properties of a variable “residualized” with respect
to another variable were described. By Formula 6.2, the value of $z_0$ as
predicted from $z_1$ is $\beta_{01} z_1$, and by Formula 6.4, $\tilde{z}_0$ has a variance of $r_{01}^2$.
By Formula 6.5, the residual $(z_0 - \tilde{z}_0)$, or $z_{0.1}$, is equal to $(z_0 - \beta_{01} z_1)$, and
by Formula 6.6, its variance is $(1 - r_{01}^2)$.

Since predicted scores and residuals are uncorrelated, every criterion
value, $z_0$, can be divided uniquely into two uncorrelated portions, $\tilde{z}_0$ and
$z_{0.1}$. Similarly, as summarized by Formula 6.7, the total variance of the
criterion, $V_0$, is the sum of the predicted variance, $V_{\tilde{0}}$, and the unpredicted
variance, $V_{0.1}$.

Any variable can be modified so that it becomes a residual variable in
relation to any number of outside variables. It is then uncorrelated with
each of the outside variables as well as with their weighted sum.¹

Thus, “residualization” of variable 0 with respect to variables 1, 2, 3 ... n
yields

$$z_{0.12 \ldots n} = z_0 - \tilde{z}_{0(12 \ldots n)} = z_0 - \beta_{01.23 \ldots n} z_1 - \beta_{02.13 \ldots n} z_2 - \cdots - \beta_{0n.12 \ldots (n-1)} z_n$$

in which $\tilde{z}_{0(12 \ldots n)}$ represents prediction of $z_0$ from the best-weighted
combination of scores in variables 1, 2 through n. The residual variable,
$z_{0.12 \ldots n}$, is not only uncorrelated with $\tilde{z}_{0(12 \ldots n)}$, but it is also uncorrelated,
both singly and in combination, with variable 1, variable 2, and the other
outside or independent variables, including variable n, with respect to
which it has been “residualized.”

The variance of this “higher order” residual variable is $(1 - R^2_{0(12 \ldots n)})$,
while the variance of the corresponding predicted scores, the $\tilde{z}_0$, is
$R^2_{0(12 \ldots n)}$. Just as with zero-order r, the variance of the residuals plus the
variance of the predicted values equals the variance of the criterion in
z form; that is,

$$V_0 = R^2_{0(12 \ldots n)} + V_{0.12 \ldots n} \tag{7.2}$$

The N values for the N cases involved in a residual variable could actually
be found. In practice, however, values of residuals for individual cases are
seldom needed; rather summary statistics are used: variances and standard
deviations describing separate variables; and covariances and correlations
involving pairs of residual variables.
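The properties of a residual variable can be verified numerically on any set of scores. The following sketch uses artificial data (generated here only for illustration, not taken from the text) with two predictors, and assumes the familiar two-predictor solution for the betas, $\beta_1 = (r_{01} - r_{02} r_{12})/(1 - r_{12}^2)$ and correspondingly for $\beta_2$. All function and variable names are hypothetical.

```python
import math
import random


def zscores(xs):
    """Convert raw scores to z form (mean 0, variance 1, dividing by N)."""
    n = len(xs)
    m = sum(xs) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return [(x - m) / sd for x in xs]


def cov(a, b):
    """Covariance (mean product of deviations), dividing by N."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n


# Artificial raw scores: two overlapping predictors and a criterion.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(200)]
x2 = [0.5 * a + rng.gauss(0, 1) for a in x1]
y = [0.6 * a + 0.3 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

z1, z2, z0 = zscores(x1), zscores(x2), zscores(y)
r01, r02, r12 = cov(z0, z1), cov(z0, z2), cov(z1, z2)

# Two-predictor betas (assumed closed form for the least squares weights).
b1 = (r01 - r02 * r12) / (1 - r12 ** 2)
b2 = (r02 - r01 * r12) / (1 - r12 ** 2)

pred = [b1 * a + b2 * b for a, b in zip(z1, z2)]   # predicted z scores
resid = [t - p for t, p in zip(z0, pred)]          # residual variable z_0.12

r_squared = b1 * r01 + b2 * r02
print("residual with z1, z2:", cov(resid, z1), cov(resid, z2))
print("predicted + residual variance:", cov(pred, pred) + cov(resid, resid))
print("R from betas:", math.sqrt(r_squared))
print("r between z0 and predicted:", cov(z0, pred) / math.sqrt(cov(pred, pred)))
```

Apart from floating-point error, the residual is uncorrelated with each predictor and with the predicted scores, the two variances sum to 1.00 as in Formula 7.2, and the correlation between actual and predicted values equals the multiple R.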


FINDING THE VARIANCES AND COVARIANCES OF RESIDUAL VARIABLES 


Formula 6.6 can be extended to find the variance of a set of residuals in
“higher-order” z form (that is, with mean of zero, but with variance less
than 1.00). If original, or “zero-order,” variables are in z form (with mean
of zero and variance of 1.00), then the variance of the residuals around the
regression line connecting any two variables, denoted here as 0 and 1, is
$(1 - r_{01}^2)$.


¹ A proof of this principle is given by Kendall (3).
² The use of variances and covariances in conceptualizing multiple R is discussed by
DuBois (2).
This formula, however, can be improved by writing it in the notation that
is convenient in finding variances of orders higher than the first. Since, in
the zero-order case, correlation coefficients, betas, and covariances of z
scores are numerically identical, $\beta_{01} C_{01}$ can be substituted for $r_{01}^2$ in
Formula 6.6. Since in original z form, any variance is 1.00, $V_0$ can be used
instead of 1.00. Accordingly, Formula 6.6 becomes

$$V_{0.1} = 1 - r_{01}^2 = V_0 - \beta_{01} C_{01} \tag{7.3}$$


Formula 7.3 is cognate with a more general formula useful for finding a
higher-order variance (or covariance) of any order from the corresponding
variance (or covariance) of immediately lower order and the product of a
beta and a covariance. This formula is

$$E' = E - \beta C \tag{7.4}$$

in which $E'$ is the variance or covariance of higher order and $E$, $\beta$, and $C$ all
involve residual variables of the next lower order.

To be specific, let i represent a variable that has been residualized with
respect to any number of other variables, collectively represented as q. Its
variance is thus $V_{i.q}$. To find the variance of a still higher order of residuals,
$V_{i.(q+k)}$, two coefficients of the same order as $V_{i.q}$ are needed: a beta,
$\beta_{ik.q}$, and a corresponding covariance, $C_{ik.q}$, which involves two residual
variables, $z_{i.q}$ and $z_{k.q}$. Formula 7.4 then becomes

$$V_{i.(q+k)} = V_{i.q} - \beta_{ik.q} C_{ik.q} \tag{7.4a}$$

Similarly, for covariances of order (q + k) from coefficients of order q,
Formula 7.4 may be written as

$$C_{ij.(q+k)} = C_{ij.q} - \beta_{ik.q} C_{jk.q} \tag{7.4b}$$


Formulas of this type become the basis of a computing routine in which 
step-by-step substitution of values in the formulas is not necessary. In 
effect, the routine replaces what would be an involved system of formulas. 
In finding R by reduction of criterion variance, as demonstrated in Example
7.1, the first matrix³ is of zero-order coefficients. From it is produced a
second matrix with one less row and one less column, and consisting of 
coefficients involving first order residual variables; that is, variables from 
which a single outside variable has been partialed out. Another matrix is 
then produced, involving the elimination of another row and another 
column, and consisting of second-order coefficients; that is, variances and 
covariances of variables from which two variables have been partialed out. 


———— 


3 A matrix is simply an arrangement of data in rows and columns of assigned meaning. 
In correlational analysis, each variable gives meaning to a row and again to a column. 
The procedure then continues until a matrix of one row and one column is
found. The single element of this matrix is $V_{0.12 \ldots n}$, the partial variance of
the criterion.
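A single reduction step of the kind described above can be sketched as follows. This is a minimal illustration under the assumption that the full symmetric matrix is stored (rather than the upper triangle used in hand computation); the function name `eliminate_first` is hypothetical. The step finds the column of betas from the top row and then applies Formula 7.4 to every remaining element.

```python
def eliminate_first(matrix):
    """Partial out the variable in the top row and first column.

    matrix is a full symmetric variance-covariance matrix (list of lists).
    Returns the intermediate betas and the reduced matrix, which has one
    less row and one less column.
    """
    top = matrix[0]
    size = len(matrix)
    # Each beta is a covariance in the top row divided by the variance in that row.
    betas = [top[j] / top[0] for j in range(1, size)]
    # Formula 7.4: E' = E - (beta in the same row) * (covariance in the same column, top row).
    reduced = [
        [matrix[i][j] - betas[i - 1] * top[j] for j in range(1, size)]
        for i in range(1, size)
    ]
    return betas, reduced


# Zero-order matrix of Example 7.1, variables in the order 2, 1, 0.
zero_order = [
    [1.00, 0.20, 0.50],
    [0.20, 1.00, 0.58],
    [0.50, 0.58, 1.00],
]
betas, first_order = eliminate_first(zero_order)
print(betas)        # intermediate betas for variables 1 and 0
print(first_order)  # variances and covariances of z_1.2 and z_0.2
```

Applied to the matrix of Example 7.1, the step reproduces the first-order values .96, .48, and .75 found there by hand.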


EXAMPLE 7.1


COMPUTATION OF MULTIPLE R WITH TWO PREDICTORS


The following matrix represents two predictor variables, variables 1 and 2,
and a criterion, variable 0. Their z score variances of 1.00 are in the main
diagonal. Above the diagonal, in appropriate rows and columns, are the
covariances among the three variables, each precisely equal to the
corresponding correlation.


VARIABLE       2       1       0

2           1.00     .20     .50
1                   1.00     .58
0                           1.00


The first step is to compute a required column of intermediate beta
coefficients. For all matrices this is accomplished by dividing the covariances
of the top row by the variance in that row. Since the variance ($V_2$) is 1.00,
the matrix now becomes


VARIABLE       2       1       0

2           1.00     .20     .50
1            .20    1.00     .58
0            .50            1.00


The next step is to form a variance-covariance matrix of residual variables,
which in this case will be $z_{1.2}$ and $z_{0.2}$. Variable 2 is “eliminated” from the
matrix by subtracting from all elements not involving variable 2 the product of
the beta in the same row (and in the first column) and the covariance in the same
column (and the top row). Thus, from $V_1$, in the second row and second column,
(.20 × .20) is subtracted. From $C_{01}$, which is .58, (.20 × .50) is subtracted; while
from $V_0$, in the last row and column, (.50 × .50) is subtracted. This procedure
produces a new matrix of the variances and the covariances of variables $z_{1.2}$ and
$z_{0.2}$ as follows:


VARIABLE     1.2     0.2

1.2          .96     .48
0.2                  .75


The next step is to produce beta coefficients by dividing the covariance(s) in
the top row by the variance in that row. Since this is a 2 × 2 matrix of two rows
and two columns only, one beta is required. It is .50, found by dividing the partial
covariance between $z_{1.2}$ and $z_{0.2}$, which is .48, by the partial variance of $z_{1.2}$,
which is .96. This partial covariance can be designated $C_{01.2}$, while the partial
variance is $V_{1.2}$. The matrix now reads:


VARIABLE     1.2     0.2

1.2          .96     .48
0.2          .50     .75


The matrix is reduced again, this time eliminating variable 1.2. Subtraction
from $V_{0.2}$ (.75) of the product of the beta in the first column (.50) and the
covariance in the top row (.48) yields the final partial variance, .51. This is $V_{0.12}$,
the variance of the criterion with the variance associated with both predictor
variables removed.

Since the sum of this partial variance and $R^2_{0(12)}$ is 1.00, it is now known that
the predicted proportion of the criterion variance, or $R^2_{0(12)}$, is .49. Consequently,
$R_{0(12)}$ is .70. The same result is obtained by the use of Formula 7.5.

Of course, in actually finding R, the successive matrices are placed one
underneath the other. All the numerical work described above is shown below:


VARIABLE       2       1       0

2           1.00     .20     .50
1            .20    1.00     .58
0            .50            1.00
1.2                  .96     .48
0.2                  .50     .75
0.12                         .51


$$R_{0(12)} = \sqrt{1 - V_{0.12}} = \sqrt{1 - .51} = \sqrt{.49} = .70$$
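The whole of Example 7.1 can be reproduced with a short loop that repeats the reduction until only the partial variance of the criterion remains. The sketch below is illustrative only; it assumes a full symmetric matrix with the criterion in the last row and column, and the function name `multiple_r` is hypothetical.

```python
import math


def multiple_r(matrix):
    """Return (R, final partial variance) by successive elimination.

    matrix: full symmetric matrix of z score variances and covariances,
    with the predictors first and the criterion last. Predictors are
    eliminated in the order in which they appear.
    """
    m = [row[:] for row in matrix]
    while len(m) > 1:
        top = m[0]
        size = len(m)
        # top[i] / top[0] is the intermediate beta for row i; top[j] is the
        # covariance in the same column, top row (Formula 7.4).
        m = [
            [m[i][j] - (top[i] / top[0]) * top[j] for j in range(1, size)]
            for i in range(1, size)
        ]
    partial_variance = m[0][0]
    return math.sqrt(1.0 - partial_variance), partial_variance


# Example 7.1: variables in the order 2, 1, 0.
r, v = multiple_r([
    [1.00, 0.20, 0.50],
    [0.20, 1.00, 0.58],
    [0.50, 0.58, 1.00],
])
print(round(v, 2), round(r, 2))  # 0.51 0.7
```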


A STRATEGY FOR FINDING $R_{0(12 \ldots n)}$


In multiple correlation, as demonstrated in Examples 7.1 and 7.2, the 
variance of the criterion is taken as 1.00. It is reduced successively by the 
removal of the proportion of the criterion variance predictable from the 
first predictor; by the proportion of the variance predictable from the second 
predictor, less that portion associated with the first; by the proportion 
predictable from the third predictor, less the portions associated with the 
first and second; and so on, until the portions of the criterion variance 
associated with all predictors have been removed. 


EXAMPLE 7.2 


COMPUTING AND CHECKING MULTIPLE R
AND THE FINAL BETA COEFFICIENTS


Purpose. This example extends the computation of multiple R to the case of 
four predictor variables and includes finding the four final-order betas. The 
analytic solution in Table 7.1, which is exactly parallel to the numeric solution 
in Table 7.2, can by analogy be expanded to any number of variables. Complete
checks are provided. All elements in all matrices have identifiable statistical 
meaning, and each operation can be expressed as a formula. 


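TABLE 7.1. ANALYTIC SOLUTION OF MULTIPLE R


In the original matrix, the variances and covariances are of zero-order z scores, with
mean of zero and unit variance. In subsequent matrices, the coefficients are in terms
of higher-order z scores, with mean of zero but with variance generally less than unity.


$$
\begin{array}{l|cccccc}
 & 4 & 3 & 2 & 1 & 0 & \text{CHECK SUM} \\
\hline
= \beta_{04.123} & V_4 & C_{34} & C_{24} & C_{14} & C_{04} & C_{4(0+T)} \\
-\beta_{03.124} \times \beta_{34} & & V_3 & C_{23} & C_{13} & C_{03} & C_{3(0+T)} \\
-\beta_{02.134} \times \beta_{24} & & & V_2 & C_{12} & C_{02} & C_{2(0+T)} \\
-\beta_{01.234} \times \beta_{14} & & & & V_1 & C_{01} & C_{1(0+T)} \\
\beta_{04} & & & & & V_0 & C_{0(0+T)} \\
\hline
= \beta_{03.124} & & V_{3.4} & C_{23.4} & C_{13.4} & C_{03.4} & C_{3(0+T).4} \\
-\beta_{02.134} \times \beta_{23.4} & & & V_{2.4} & C_{12.4} & C_{02.4} & C_{2(0+T).4} \\
-\beta_{01.234} \times \beta_{13.4} & & & & V_{1.4} & C_{01.4} & C_{1(0+T).4} \\
\beta_{03.4} & & & & & V_{0.4} & C_{0(0+T).4} \\
\hline
= \beta_{02.134} & & & V_{2.34} & C_{12.34} & C_{02.34} & C_{2(0+T).34} \\
-\beta_{01.234} \times \beta_{12.34} & & & & V_{1.34} & C_{01.34} & C_{1(0+T).34} \\
\beta_{02.34} & & & & & V_{0.34} & C_{0(0+T).34} \\
\hline
 & & & & V_{1.234} & C_{01.234} & C_{1(0+T).234} \\
\beta_{01.234} & & & & & V_{0.234} & C_{0(0+T).234} \\
\hline
 & & & & & V_{0.1234} & C_{0(0+T).1234} \\
\end{array}
$$

Final check: $C_{0(0+T).1234} = V_{0.1234}$

Multiple R: $R_{0(1234)} = \sqrt{1 - V_{0.1234}}$

Back solution for $\beta$ weights as indicated in the boxes:

$$\beta_{02.134} = \beta_{02.34} - \beta_{01.234}\,\beta_{12.34}$$
$$\beta_{03.124} = \beta_{03.4} - \beta_{01.234}\,\beta_{13.4} - \beta_{02.134}\,\beta_{23.4}$$
$$\beta_{04.123} = \beta_{04} - \beta_{01.234}\,\beta_{14} - \beta_{02.134}\,\beta_{24} - \beta_{03.124}\,\beta_{34}$$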

Coefficients. The three types of coefficients used in the cells of the successive
matrices of the analytic solution are: variances denoted as V, covariances denoted
as C, and betas denoted as $\beta$.

In the original matrix the variables are considered to be in conventional z
form, with all variances unity, with all covariances equal to corresponding
correlations, and with any beta as a covariance divided by the variance of one
of the variables concerned, also equal to a correlation.

Subsequent matrices consist of variances and covariances of residual variables
in higher-order z form; that is, with means equal to zero, but with variances
generally less than unity. Intermediate betas, used in developing the subsequent
matrix, are formed from the covariances and the variance in the top row of each
new matrix. In this zero-order matrix, columns and rows are designated from
4 down to 0, with an additional column for the sum variate, T. The arrangement
coincides with an order of work from left to right and from top to bottom.
TABLE 7.2. NUMERIC SOLUTION
Computing and checking multiple R for four predictors. Data are from Rosen and
Van Horn (4)


                                                                CHECK
BETAS (IN BOXES)          4       3       2       1       0       SUM   VARIABLE

= .0360              1.0000   .3900   .3700   .2500   .3000    2.3100   (4)
-(.1854) × .3900              1.0000   .3500   .0800   .3500    2.1700   (3)
-(.3853) × .3700                      1.0000   .4400   .5500    2.7100   (2)
-(.1967) × .2500                              1.0000   .3900    2.1600   (1)
.3000                                                 1.0000    2.5900   (0)

= .1854                        .8479   .2057  -.0175   .2330    1.2691   (3.4)
-(.3853) × .2426                       .8631   .3475   .4390    1.8553   (2.4)
-(.1967) × -.0206                              .9375   .3150    1.5825   (1.4)
.2748                                                  .9100    1.8970   (0.4)

= .3853                                .8132   .3517   .3825    1.5474   (2.34)
-(.1967) × .4325                               .9371   .3198    1.6086   (1.34)
.4704                                                  .8460    1.5483   (0.34)

                                               .7850   .1544     .9393   (1.234)
.1967                                                  .6661     .8204   (0.234)

                                                       .6357     .6356   (0.1234)


Identification of Variables
4. Verbal score
3. Quantitative score
2. High school rank
1. Application blank
0. 1st semester grade point average

$R^2_{0(1234)}$ = 1 - .6357 = .3643        $R_{0(1234)}$ = .60

$\beta_{01.234}$ = .1967
$\beta_{02.134}$ = .4704 - (.1967 × .4325) = .3853
$\beta_{03.124}$ = .2748 - (.1967 × -.0206) - (.3853 × .2426) = .1854
$\beta_{04.123}$ = .3000 - (.1967 × .2500) - (.3853 × .3700) - (.1854 × .3900) = .0360

$R^2_{0(1234)}$ = (.0360 × .3000) + (.1854 × .3500) + (.3853 × .5500)
+ (.1967 × .3900) = .3643

In forming each new matrix after the first, one of the variables in the
preceding matrix has been “eliminated,” which means that all variables remaining
in the new matrix have been “residualized” with respect to the variable
“eliminated,” or “partialed out.”

Thus, the second matrix consists of variances and covariances of first-order
residual variables (with one variable eliminated); the third matrix of variances
and covariances of second-order residuals (with two variables eliminated);
and so on. Betas, derived from the top row of coefficients, are written
underneath the V. Each beta is found by the formula: $\beta = C/V$.

Forming the New Matrices. In forming each new matrix, the computing
formula is

$$E' = E - \beta C$$

in which $E'$ is the element (variance or covariance) of higher order, $E$ is the
corresponding element in the preceding matrix, $\beta$ is the beta in the same row
but first column of that matrix (under the variance used in computing the
column of betas), and $C$ is the covariance in the same column but top row.
When variate 4 is partialed out of the matrix, 4 disappears as a primary
subscript. Its appearance as a secondary subscript in all coefficients in subsequent
matrices shows that the variance associated with variate 4 has been subtracted
from each value of each remaining variate.

After variate 4 has been partialed out, the variables forming the new matrix
of variances and covariances are the following higher-order z scores: $z_{3.4}$,
$z_{2.4}$, $z_{1.4}$, and $z_{0.4}$. Similarly, after $z_{3.4}$ has been partialed out, the resultant matrix
is of variances and covariances of second-order z scores: $z_{2.34}$, $z_{1.34}$, and $z_{0.34}$.
Next $z_{2.34}$ is partialed out, leaving a matrix involving only two variables, $z_{1.234}$
and $z_{0.234}$. The final, single element matrix is the variance of the fourth-order
residuals, $z_{0.1234}$.

Check Sum. In the column headed T appear the coefficients of an artificial
variable developed, in effect, by summing the z scores of the original variables.
In the first matrix, values are obtained by summing all the terms pertaining to
a variable, pivoting on the variance. In subsequent matrices, values are obtained
just as though T were an additional variate (except that no betas are found),
and these computed values are compared with those obtained by summation
within the matrix. Discrepancies in the final decimal place may be expected, but
other discrepancies indicate that an error has been made.

Finding Final-Order Betas. With n predictors, the regression equation based
on all of them requires n betas of the (n - 1)st order. In the forward solution
one is available as the beta in the nth matrix. In the back solution, a second beta
comes from the (n - 1)st matrix, a third from the (n - 2)nd matrix, and so on
up to the first matrix, which yields the nth beta of (n - 1)st order. The rule is
invariant. Each beta of the required order is the beta in the matrix applicable
to predicting the criterion, less the sum of the products of betas of the (n - 1)st
order already found and corresponding betas in the matrix. Betas are regarded
as corresponding when they share a primary subscript. In the analytic solution,
appropriate formulas for final-order betas are shown as equations and also as
operations within boxes.

Computation of R. Formulas 7.2 and 7.5 indicate that when one starts with a
matrix of coefficients of variables in z form, the square of any multiple R is unity
less the variance of the criterion after variance associated with all predictors has
been removed. This relationship is the basis of Table R in the Appendix, which
gives two-place values of multiple R corresponding to four-place values of partial
variances.

Checking R and the Final Betas. $R^2_{0(12 \ldots n)}$, found as $1 - V_{0.12 \ldots n}$, can be
checked by multiplying each validity coefficient by the corresponding final-order
beta and summing the products. That is to say, $R^2_{0(12 \ldots n)} = \sum_i \beta_{0i} C_{0i}$, in which
i represents each predictor variable in turn. The procedure checks both R and
the final betas.


Computations. In Table 7.2, the numeric solution, are shown the computations
corresponding to the analytic solution in Table 7.1. Variables are identified at
the right. At the left in boxes are the intermediate betas used for computations
within matrices as well as the final-order betas found in the back solution.

Steps in the procedure follow:

1. Write a triangular matrix of coefficients of (n + 1) variables, with variances
(1.0000) in the main diagonal and covariances (equal to r) above the diagonal.
(Strictly speaking, the matrix has (n + 1)² terms, but since it is symmetrical,
covariances below the diagonal need not be written.)

2. Find the entries in the check sum column by summing down each column
and pivoting on the variance. Thus, 1.0000 + .3900 + .3700 + .2500 + .3000 =
2.3100; and .3900 + 1.0000 + .3500 + .0800 + .3500 = 2.1700.

3. Find the column of intermediate betas by dividing the covariances of the
top row by the variance in that row. Thus, .3900/1.0000 = .3900; .3700/1.0000
= .3700; etc.

4. Compute the entries of the new matrix by the ($E' = E - \beta C$) formula.
Thus, 1.0000 - (.3900 × .3900) = .8479; .3500 - (.3900 × .3700) = .2057; .0800
- (.3900 × .2500) = -.0175; .3500 - (.3900 × .3000) = .2330; and 2.1700 -
(.3900 × 2.3100) = 1.2691. It is to be noted that the procedure with the sum
variable is the same as with the other variables.

5. Sum down each column and across each row of the new matrix, pivoting
on the variance. Each sum should agree with the result of the computation
involving the sum variable. As examples, .8479 + .2057 - .0175 + .2330 =
1.2691, and .2057 + .8631 + .3475 + .4390 = 1.8553.

6. Continue the sequence of steps until a single element matrix has been found,
the partial variance of the criterion, after variance predictable from all the
independent variables has been removed. Except for rounding error, it should be
identical with the final entry in the check sum column. In this case, $V_{0.1234}$ is
.6357; the final check is .6356.

7. Find the multiple by the use of Table R (Appendix) or Formula 7.5.

8. Compute the final-order betas, using the single beta in the (2 × 2) matrix
and the intermediate betas in the preceding matrices.

9. Check the final betas by multiplying each by the corresponding validity
coefficient and summing the products. Here (.0360 × .3000) + (.1854 × .3500)
+ (.3853 × .5500) + (.1967 × .3900) = .3643 = $R^2_{0(1234)}$.
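Steps 1 through 6 can be sketched as a short routine. The sketch is an illustration under stated assumptions rather than a transcription of the hand procedure: the full symmetric matrix is stored, the check sum is carried as an extra column, values are not rounded to four places between matrices, and the name `reduce_with_check` is hypothetical. The data are the zero-order correlations of Table 7.2.

```python
def reduce_with_check(matrix):
    """Successive elimination with a check sum column.

    matrix: full symmetric matrix, variables in order of elimination,
    criterion last. Returns a list of (betas, reduced rows, largest
    check discrepancy), one entry per elimination.
    """
    # Step 2: append the check sum T to every row.
    work = [row[:] + [sum(row)] for row in matrix]
    stages = []
    while len(work) > 1:
        top = work[0]
        size = len(work)
        # Step 3: intermediate betas from the top row.
        betas = [top[j] / top[0] for j in range(1, size)]
        # Step 4: E' = E - beta * C, applied to the T column as well.
        work = [
            [work[i][j] - betas[i - 1] * top[j] for j in range(1, size + 1)]
            for i in range(1, size)
        ]
        # Step 5: each row sum should agree with the computed T entry.
        gap = max(abs(sum(row[:-1]) - row[-1]) for row in work)
        stages.append((betas, work, gap))
    return stages


# Zero-order correlations of Table 7.2, variables in the order 4, 3, 2, 1, 0.
corr = [
    [1.00, 0.39, 0.37, 0.25, 0.30],
    [0.39, 1.00, 0.35, 0.08, 0.35],
    [0.37, 0.35, 1.00, 0.44, 0.55],
    [0.25, 0.08, 0.44, 1.00, 0.39],
    [0.30, 0.35, 0.55, 0.39, 1.00],
]
stages = reduce_with_check(corr)
final_variance = stages[-1][1][0][0]
print(round(final_variance, 4))            # partial variance of the criterion
print(max(gap for _, _, gap in stages))    # largest check sum discrepancy
```

Because no intermediate rounding occurs, the check sums here agree to machine precision, whereas the hand solution in Table 7.2 shows a discrepancy of .0001 in the final place.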
In this procedure the original matrix is considered to consist of: 


1. Zero-order z score variances in the main diagonal (all are 1.00); and 


2. Zero-order z score covariances above the diagonal (each precisely equal
to the corresponding r).


The term zero order refers to the fact that the variables at this stage are 
unmodified by the process of forming residuals. In the second matrix, the 
variances and covariances are of first-order residuals (that is, variables with 
the variance associated with one variable removed); in the third matrix 
they are of the second order, with variance associated with two variables 
removed, and so on. 

Mathematically, as explained in Chapter 16, each matrix is square, and 
except for the final matrix consisting of a single residual (or partial) 
variance, there are covariances below the diagonal as well as above it. 
However, since the original and all succeeding matrices are symmetrical, it 
is not necessary to write out the covariances below the diagonal. This 
space then becomes a convenient place to write certain beta coefficients,
used in applying Formula 7.4. 

The final partial variance, which is the only element in the final matrix,
is the proportion of the criterion variance not predictable from the predictors.
It is also the variance of the criterion variable (in z form) “residualized”
with respect to all the predictors. It is found by repeated applications
of Formula 7.4, together with a routine for finding certain required beta
coefficients by dividing a partial covariance (the covariance between two
residual variables) by the variance of one of the variables concerned.

In this routine, the variable represented in the top row and first column 
of any matrix is "eliminated" in forming the subsequent matrix. By 
“elimination” is meant the process of transforming each variable entering 
into the new matrix by subtracting from each value that portion perfectly 
correlated with the eliminated variable. The residual variables so formed
are uncorrelated with all variables previously eliminated.


COMPUTATION OF A MULTIPLE R FROM A PARTIAL VARIANCE 


Each new matrix, as it is computed, includes the variance of the criterion 
less those portions associated with the variables previously eliminated. 
This partial variance is the proportion of the criterion variance that is not 
predicted by the "eliminated" predictors. Subtraction of this partial 
variance from 1.00 results in the predicted proportion of the criterion 
variance, and this predicted proportion is the square of a correlation co- 
efficient. The correlation is the multiple. Accordingly, to find multiple R 
corresponding to a group of “eliminated” predictors (that is, the correlation
between the criterion and a group of independent variables each
weighted in the best possible fashion), it is necessary merely to subtract the
partial variance of the criterion (after these variables have been “eliminated”)
from 1.00 and take the square root of the result.

This is an application of Formula 7.2, which may be written as

$$R_{0(12 \ldots n)} = \sqrt{1 - V_{0.12 \ldots n}} \tag{7.5}$$

Formula 7.5 yields the multiple between the criterion and all the variables
that have been “eliminated.” It can be applied at any step of a solution,
and is also applicable to the partial variances in the diagonal of each successive
matrix. Thus, after variables 4 and 3 have been eliminated, $R_{1(34)}$,
for example, can be found from $V_{1.34}$.
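As a brief illustration of applying Formula 7.5 at successive steps, the sketch below uses the partial variances of the criterion from the numeric solution of Example 7.2 (.9100, .8460, .6661, and .6357). For $R_{1(34)}$ it takes $V_{1.34}$ as .9371, the value obtained from the first-order entries as .9375 - (-.0206 × -.0175). The variable names are hypothetical.

```python
import math

# Partial variance of the criterion after each elimination (Example 7.2).
partial_variances = {
    "4": 0.9100,
    "4, 3": 0.8460,
    "4, 3, 2": 0.6661,
    "4, 3, 2, 1": 0.6357,
}

# Formula 7.5: R is the square root of 1.00 less the partial variance.
multiples = {k: math.sqrt(1.0 - v) for k, v in partial_variances.items()}
for eliminated, r in multiples.items():
    print(f"predictors {eliminated}: R = {r:.2f}")

# The same formula applied to a diagonal entry for a predictor:
# variable 1 as "criterion", variables 3 and 4 eliminated.
r_1_34 = math.sqrt(1.0 - 0.9371)
print(f"R_1(34) = {r_1_34:.2f}")
```

The multiple rises from .30 with variable 4 alone to .60 with all four predictors, most of the gain coming when variable 2 is eliminated.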


FINDING THE BETA COEFFICIENTS 
As already pointed out, when only two variables are concerned, each of the
beta coefficients (for estimating $z_0$, given $z_1$, and for estimating $z_1$, given
$z_0$) is exactly equal to the correlation between the two variables, $r_{01}$.
When three variables are concerned, there are six possible regression
coefficients, none of which is necessarily equal to any of the others. If
variables 0, 1, and 2 comprise the system, the six betas are $\beta_{01.2}$ and $\beta_{02.1}$ (for
use when variable 0 is the criterion and to be applied to $z_1$ and $z_2$,
respectively), $\beta_{10.2}$ and $\beta_{12.0}$ (when variable 1 is the criterion), and $\beta_{20.1}$ and
$\beta_{21.0}$ (when variable 2 is the criterion).

In the solution of a multiple correlation problem, it is by no means 
necessary to find all the possible beta coefficients, which become very 
numerous as the number of variables increases. Final-order betas are found 
by routines that correspond precisely to the solution of simple, simul- 
taneous linear equations. However, the computation of each of the betas 
used in the forward solution for multiple R can be expressed as the division 
of a partial covariance by a partial variance of the same order. Let i and j 
represent any two variables; let q represent any number of variables with 
respect to which i and j have been “residualized.” Then, in higher-order
z form, $C_{ij.q}$ is the covariance between $z_{i.q}$ and $z_{j.q}$, both residuals of the
same order and from which variance associated with the same set of variables
has been removed. Then

$$\beta_{ij.q} = \frac{C_{ij.q}}{V_{j.q}} \tag{7.6}$$

and

$$\beta_{ji.q} = \frac{C_{ij.q}}{V_{i.q}} \tag{7.6a}$$

It is apparent from Formulas 7.6 and 7.6a that the variance used as the
divisor is the partial variance derived from the original variable to which
the beta would be applied in a regression equation; that is, $\beta_{ij.q}$ would be
applied to $z_j$ in predicting $z_i$ as a criterion, while $\beta_{ji.q}$ would be applied to
$z_i$ in predicting $z_j$ as a criterion.

In the “back” solution, the complete set of betas required for a single 
regression equation is found. With n predictors, n betas of the (n - 1)st
order are required. In such a set of betas, all are of the same order, all are 
based on the same set of predictors, and all are in reference to the same 
variable as a criterion. However, each is the beta applied to a different 
predictor, and the set of eliminated or secondary variables is different in 
each case. 

One of these betas appears in the forward solution, in the 2 x 2 matrix, 
just preceding the single-element matrix consisting of the final partial 
variance of the criterion. Each of the preceding matrices yields another 
final beta, each of which is a lower-order beta less one or more products 
between betas in the matrix and the final-order betas already found. 
To find final-order betas involving four predictors, operations are summarized
in the following formulas:

$$\beta_{02.134} = \beta_{02.34} - \beta_{01.234}\,\beta_{12.34} \tag{7.7}$$
$$\beta_{03.124} = \beta_{03.4} - \beta_{01.234}\,\beta_{13.4} - \beta_{02.134}\,\beta_{23.4} \tag{7.7a}$$
$$\beta_{04.123} = \beta_{04} - \beta_{01.234}\,\beta_{14} - \beta_{02.134}\,\beta_{24} - \beta_{03.124}\,\beta_{34} \tag{7.7b}$$

The computing routine involves:

1. Identification of the final-order beta in the 2 by 2 matrix. This is the
beta that, in a z-score regression equation, would be applied to the last
predictor eliminated.

2. Computation of a second final-order beta from the 3 by 3 matrix.
This is the beta that, in a z-score regression equation, would be applied to
the predictor eliminated next to the last. It is found by subtracting from the
beta at the bottom of the first column the product of the final-order beta
already found and the beta in the cell next to the bottom of the first column.

3. Computation of the other final-order betas by the same general
method. Each is the beta at the bottom of the first column less the products
of the final-order betas already found and the betas used in the elimination
process. Always the work is from the bottom, and each final-order beta is
always used in a row appropriate to the variable to which, in a z-score
regression equation, the beta would be applied.

This routine is demonstrated in Example 7.2.
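The back solution can be sketched as a loop that works from the last matrix toward the first. The sketch below is illustrative only: the function name `back_solve` and the way the intermediate betas are packaged are assumptions made here, and the numerical values are the four-place intermediate betas of the numeric solution in Example 7.2.

```python
def back_solve(stages):
    """Find final-order betas from intermediate betas.

    stages is ordered from the last matrix back to the first. Each item is
    (criterion_beta, link_betas): the beta at the bottom of the first column,
    and the betas in the same column that pair with the final-order betas
    already found (in the order in which those were found).
    """
    finals = []
    for criterion_beta, link_betas in stages:
        value = criterion_beta - sum(f * b for f, b in zip(finals, link_betas))
        finals.append(value)
    return finals


stages = [
    (0.1967, []),                        # 2 x 2 matrix: beta_01.234
    (0.4704, [0.4325]),                  # 3 x 3 matrix: beta_02.34; beta_12.34
    (0.2748, [-0.0206, 0.2426]),         # 4 x 4 matrix: beta_03.4; beta_13.4, beta_23.4
    (0.3000, [0.2500, 0.3700, 0.3900]),  # 5 x 5 matrix: beta_04; beta_14, beta_24, beta_34
]
finals = back_solve(stages)  # betas for predictors 1, 2, 3, 4 in that order
print([round(b, 4) for b in finals])
```

Each pass through the loop corresponds to one of Formulas 7.7 through 7.7b, with the final-order betas already found reused in every later pass.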
The n final-order betas of the (n - 1)st order can be checked by multiplying
each by the corresponding validity or correlation between the criterion
and the variable to which the beta is applied in the regression equation,
and summing the products. The result is $R^2_{0(12 \ldots n)}$, the square of the
multiple correlation between the criterion and the weighted sum of the n
predictors. The value of $R^2_{0(12 \ldots n)}$ must correspond with the value formed by
subtracting from 1.00 the partial variance of the criterion after variance
associated with the n predictors has been subtracted.

Algebraically,

$$R^2_{0(12 \ldots n)} = 1 - V_{0.12 \ldots n} = \beta_{01.23 \ldots n}\, r_{01} + \beta_{02.13 \ldots n}\, r_{02} + \cdots + \beta_{0n.12 \ldots (n-1)}\, r_{0n} \tag{7.8}$$
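The check expressed by Formula 7.8 amounts to a sum of products. A minimal sketch, using the four-place final-order betas, validities, and final partial variance from the numeric solution of Example 7.2 (variable names are hypothetical):

```python
# Final-order betas and validities for predictors 1, 2, 3, 4 (Example 7.2).
betas = [0.1967, 0.3853, 0.1854, 0.0360]
validities = [0.39, 0.55, 0.35, 0.30]

# Left side of the check: sum of beta times validity.
r_squared_from_betas = sum(b * r for b, r in zip(betas, validities))

# Right side of the check: 1.00 less the final partial variance.
r_squared_from_variance = 1.0 - 0.6357

print(round(r_squared_from_betas, 4), round(r_squared_from_variance, 4))
```

The two values agree to four places, which checks both the forward solution and the back solution.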


SELECTING A HIGHLY VALID COMBINATION OF VARIABLES

A slight modification in the procedure for finding multiple R results in a
logical method for selecting out of a pool of predictors a limited number of
variables that in combination will have a high correlation with the criterion.
In psychological and educational research, there is often considerable
overlap among predictors, so that four or five well-chosen variables may
have almost as high a composite validity as eight or ten predictors.
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If the variable with the highest correlation with the criterion is selected 
as the first predictor, a good start will have been made in picking an effec- 
tive team of predictors. When all variables remaining in the matrix have 
been “residualized” with respect to this first variable, the task then be- 
comes one of choosing that predictor which at that point will add most to 
the multiple. 

This second variable can be readily identified. The partial covariance 
between each predictor and the criterion is squared and divided by the 
partial variance of that predictor. The variable with the highest ratio so 
derived is the one that, at this stage, will add the most to the multiple. 
Actually, these ratios are the squares of the correlations between the
criterion and the predictors “residualized” with respect to the variables first
“eliminated” from the matrix. Hence, the identification procedure
consists merely of comparing a set of r²’s. The variable with the highest
$C_{0i.q}^2/V_{i.q}$ ratio is then eliminated.

The third variable to be used in the team is identified similarly; the square 
of each partial covariance between each remaining variable and the criterion 
is divided by the partial variance of the variable. The residual variable with 
the highest resultant ratio, and hence the highest correlation with the cri- 
terion, is selected to be the next variable to be eliminated. Each cycle adds 
a predictor to the team, and the process may be continued for as many 
cycles as one wishes. After a time, however, if the original group of pre- 
dictors have considerable overlap, the ratios may be small and the gain in 
multiple R will be negligible. When this happens, the selected team may be 
almost as useful as the original group of predictors. The procedure is 
illustrated in Example 7.3. 
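The selection cycle just described can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the pool consists of the four predictors whose intercorrelations and validities appear in Table 7.4 (not the full pool of the original study), and the function name `select_team` is hypothetical. At each cycle it computes the ratio C²/V for every remaining variable, selects the largest, and residualizes the remaining variables and their covariances with the criterion.

```python
def select_team(R, r, team_size):
    """Pick predictors one at a time by the largest C^2 / V ratio.

    R: predictor intercorrelations (unity in the diagonal).
    r: validities.
    Returns (indices in order of selection, gain in R^2 at each cycle).
    """
    n = len(r)
    V = [list(row) for row in R]   # partial variances and covariances
    C = list(r)                    # partial covariances with the criterion
    remaining = list(range(n))
    chosen, gains = [], []
    for _ in range(team_size):
        best = max(remaining, key=lambda i: C[i] ** 2 / V[i][i])
        gains.append(C[best] ** 2 / V[best][best])
        chosen.append(best)
        remaining.remove(best)
        pivot = V[best][best]
        column = [V[i][best] for i in range(n)]  # covariances with the chosen variable
        c_best = C[best]
        for i in remaining:
            b = column[i] / pivot                # intermediate beta
            C[i] -= b * c_best
            for j in remaining:
                V[i][j] -= b * column[j]
    return chosen, gains


labels = [10, 9, 6, 5]
R = [[1.00, 0.25, 0.51, 0.30],
     [0.25, 1.00, 0.21, 0.26],
     [0.51, 0.21, 1.00, 0.27],
     [0.30, 0.26, 0.27, 1.00]]
validities = [0.39, 0.38, 0.35, 0.32]

chosen, gains = select_team(R, validities, team_size=4)
order = [labels[i] for i in chosen]
print(order)
print([round(g, 3) for g in gains], round(sum(gains), 3))
```

The sum of the gains is the square of the multiple R for the selected team. In practice the cycle would be stopped once the gains become negligible.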


EXAMPLE 7.3
SELECTING A HIGHLY VALID COMBINATION OF PREDICTORS⁴


Purpose. A subset of n’ predictors may be nearly as valid as the total group of
n predictors. At any stage of elimination, the variable that will add most to
multiple R can be identified as the one having the highest $C_{0i.q}^2/V_{i.q}$ ratio, in
which 0 is the criterion, i is the predictor, and q designates all variables previously
eliminated.

In the original matrix, where all variances are unity, the variable with the
highest validity has the highest $C_{0i.q}^2/V_{i.q}$ ratio.

Procedure. Steps are: 


1. Eliminate the variable with the highest validity.

2. Test the remaining variables by computing $C_{0i.q}^2/V_{i.q}$ for each, thus
determining the variable that, at this stage, will add most to the multiple.

3. Eliminate this variable.

4. Again test the remaining variables by finding $C_{0i.q}^2/V_{i.q}$ for each, and
continue the process until it appears that no remaining variable will add
appreciably to the multiple, as indicated by Formula 7.9. (Sometimes, however,
n’, the number of predictors in the subgroup, is determined for reasons of
convenience rather than on the basis of purely statistical considerations.)


⁴ This procedure, originally developed by Wherry (8), is known as the Wherry-
Doolittle Method. Notation and layout are somewhat revised from the original
presentation.


After the n’ predictors have been chosen, the sum of the square of the validity
of the first predictor and of the $C_{0i.q}^2/V_{i.q}$ ratios of the other selected predictors
yields R². The related final-order betas must be found from a back solution
involving only the n’ predictors and the criterion.

If the procedures are to be completely in the model of finding multiple R as 
demonstrated in Example 7.2, each successive matrix is arranged so that the 
variable selected for elimination (and hence for inclusion in the team of pre- 
dictors) is in the first row and column, and then this variable is eliminated 
throughout the matrix. 

An Accelerated Procedure. Considerably faster is the procedure demonstrated
in Table 7.3. The steps follow.


1. Each variable chosen for the multiple is eliminated immediately only from
the variances in the diagonal and the covariances involving the criterion.

2. When a variable discovered by the $C_{0i.q}^2/V_{i.q}$ technique is chosen for
the multiple, then

(a) All covariances involving that variable are computed for the second matrix
and subsequent matrices through the one from which it is to be eliminated.

(b) Intermediate betas are found, as always, by dividing covariances by the
variance of the variable to be eliminated. For convenience the betas are
placed in a column at the left.

3. Entries for the new matrix are found by the regular formula, E′ = E − βC.
The difference from the procedure described in connection with Example 7.2 is
that the required covariance is not necessarily in the top row. It is in the column
or row assigned to the variable to be eliminated.

In Table 7.3 it happens that the first two variables to be eliminated, $z_{10}$ and
$z_{9.10}$, are in the first row and first column of their respective matrices. In the
third matrix it is found that $z_{5.9,10}$ will, at that stage, add most to the multiple.
Accordingly, covariances involving $z_{5.10}$ are computed for the second matrix,
followed by the covariances involving $z_{5.9,10}$ for the third matrix. In any matrix,
rows and columns involving previously eliminated variables are omitted.

4. In the fourth matrix it is seen that $z_{6.5,9,10}$ will add most to the multiple.
If the decision is made that only four predictors are needed, no further
elimination is required.

5. The square of the multiple R is
$r_{0,10}^2 + C_{09.10}^2/V_{9.10} + C_{05.9,10}^2/V_{5.9,10} + C_{06.5,9,10}^2/V_{6.5,9,10}$,
or .152 + .085 + .025 + .017 = .279. While third-order betas could be found
from the forward solution, the multiple is recomputed, final-order betas are
found, and the work is checked in Table 7.4, using the procedures described in
Example 7.2.


TABLE 7.4. COMPUTING AND CHECKING R AND FINAL BETAS FOR FOUR
SELECTED PREDICTORS

(Predictors Selected in Table 7.3)

                                                                    CHECK
VARIABLE       β        10        9        6        5        0       SUM
10           .2029   1.0000    .2500    .5100    .3000    .3900    2.4500
9            .2582    .2500   1.0000    .2100    .2600    .3800    2.1000
6            .1515    .5100             1.0000   .2700    .3500    2.3400
5            .1511    .3000                      1.0000   .3200    2.1500
0                     .3900                               1.0000   2.4400

9.10         .2582              .9375    .0825    .1850    .2825    1.4875
6.10         .1515              .0880    .7399    .1170    .1511    1.0905
5.10         .1511              .1973             .9100    .2030    1.4150
0.10                            .3013                      .8479    1.4845

6.9,10       .1515
5.9,10       .1511
0.9,10
5.6,9,10
0.6,9,10
0.5,6,9,10


$R = \sqrt{1 - V_{0.5,6,9,10}} = \sqrt{1 - .7215} = .53$
$R^2 = 1 - V_{0.5,6,9,10} = 1 - .7215 = .2785$
$\beta_{05.6,9,10} = .1511$
$\beta_{06.5,9,10} = .1723 - (.1511 \times .1375) = .1515$
$\beta_{09.5,6,10} = .3013 - (.1511 \times .1973) - (.1515 \times .0880) = .2582$
$\beta_{0,10.5,6,9} = .3900 - (.1511 \times .3000) - (.1515 \times .5100) - (.2582 \times .2500) = .2029$
$R^2 = \sum \beta C = .2786$


Cross Validation. The procedure for finding multiple R obtains a maximum 
relationship between the criterion and the weighted team of predictors only in 
the sample. As indicated by Formula 7.9, shrinkage in subsequent samples is to 
be anticipated. Table 7.5 shows the process of applying the betas from Table
7.4 to a new sample of cases measured on the same variables.

The intercorrelations of the four predictors with unity in the diagonal cells 
are regarded as a variance-covariance matrix, and the four validity coefficients 
are considered covariances. To find the correlation between the criterion and 
the weighted sum of the predictors, the following operations are required: 


1. Each validity must be multiplied by the corresponding beta. The sum of the
products is the covariance between the criterion and the weighted sum of the
predictors.

2. Each of the intercorrelations must be multiplied by two betas (one for each
predictor variable).

3. Each variance in the diagonal must be multiplied by the square of the
corresponding beta.

4. The square root of the sum of the results of the operations with the
predictors (steps 2 and 3) is the standard deviation of the weighted sum of the
predictors. When divided into the covariance of the weighted sum with
the criterion (which retains its standard deviation of 1.00), the quotient is
the cross-validated multiple.


TABLE 7.5. APPLICATION OF BETAS AND OF UNIT WEIGHTS IN
CROSS-VALIDATION SAMPLE

Data Slightly Modified from Sprunger (5). N = 112

                                          PREDICTORS             CRITERION
VARIABLE   BETA                        10      9       6       5        0
10         β_{0,10.5,6,9} = .20       1.00    .25     .48     .14      .27
9          β_{09.5,6,10} = .26         .25   1.00     .12     .21      .22
6          β_{06.5,9,10} = .15         .48    .12    1.00     .23      .23
5          β_{05.6,9,10} = .15         .14    .21     .23    1.00      .20
Column sums                           .3580  .3595   .3117   .2671    .1757
Column sum × beta                     .0716  .0935   .0468   .0401

$r = \dfrac{.1757}{\sqrt{.0716 + .0935 + .0468 + .0401}} = .35$


Application of unit weights (from matrix above):

Variable:        10      9       6       5       0
Column sums:    1.87    1.58    1.83    1.58    .92

$r = \dfrac{.92}{\sqrt{1.87 + 1.58 + 1.83 + 1.58}} = .35$


In Table 7.5 these operations are performed by summing products of betas 
and coefficients, column by column; multiplying these column sums of products 
involving the predictors by the beta appropriate to the column; extracting the 
square root of the sum of these products; and dividing into the weighted column 
sum of the validities.

In cross-validation, the correlation is .35, a marked drop from the obtained
multiple of .53. In this study, the coefficients are not stable, and the highest
validity, .27, for any of the four predictors in the second sample is less than the 
lowest validity, .32, in the first sample. 

It is also to be noted that, within the group of four predictors, little is gained 
by differential weighting according to the betas. If all weights are taken arbitrarily 
as 1.00, then, by the procedure outlined above (summing the validities and dividing 
by the square root of the sum of the entries in the predictor matrix), the overall 
validity is again .35. 
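The operations performed in Table 7.5 amount to computing the covariance of the weighted sum with the criterion and dividing by the standard deviation of the weighted sum. A minimal Python sketch follows, using the coefficients of Table 7.5. The function name `composite_validity` is hypothetical and is introduced only for illustration.

```python
import math


def composite_validity(weights, R, validities):
    """Correlation between the criterion and a weighted sum of z-score predictors."""
    n = len(weights)
    # Covariance of the weighted sum with the criterion (step 1).
    covariance = sum(w * v for w, v in zip(weights, validities))
    # Variance of the weighted sum: every cell times its two weights (steps 2 and 3).
    variance = sum(weights[i] * weights[j] * R[i][j]
                   for i in range(n) for j in range(n))
    return covariance / math.sqrt(variance)  # step 4


# Cross-validation sample (Table 7.5), predictors 10, 9, 6, 5.
R = [[1.00, 0.25, 0.48, 0.14],
     [0.25, 1.00, 0.12, 0.21],
     [0.48, 0.12, 1.00, 0.23],
     [0.14, 0.21, 0.23, 1.00]]
validities = [0.27, 0.22, 0.23, 0.20]
betas = [0.20, 0.26, 0.15, 0.15]

r_betas = composite_validity(betas, R, validities)
r_unit = composite_validity([1.0] * 4, R, validities)
print(round(r_betas, 2), round(r_unit, 2))
```

Passing a list of ones as the weights gives the unit-weight composite, so the same function covers both parts of the table.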

Within an observed sample, it is not difficult to maximize, or approximately 
to maximize, the correlation of a weighted sum of predictors with a criterion. 
However, for such maximized coefficients to be stable from sample to sample, 
the correlations on which they are based must be reasonably stable. 


It should not be supposed, however, that when n’ variables are selected
by this method from a total pool of n predictors, the n’ so chosen
necessarily constitute the best team of n’ predictors for estimating criterion
values. The team of n’ predictors will tend to be the best combination, but
theoretically at least, one might start with a predictor that does not have
the highest correlation with the criterion and select a still more valid
combination. In fact, the best single predictor may not enter the most valid
combination at all. To select the most valid team of n’ variables from a pool
of n predictors, there seems to be no direct solution other than the
comparison of the validities of combinations of the variables, taking n’ at a time.
However, the procedure of selecting a highly valid group of predictors, 
beginning with the most valid, generally leads to a reasonably good 
solution. 


SUPPRESSOR VARIABLES AND NEGATIVE BETAS 


Occasionally in multiple correlation one encounters a “suppressor” 
variable. One example is a variable that has zero correlation with a cri- 
terion, but a high correlation with a valid predictor. Such a variable in- 
creases the multiple by “suppressing” some of the invalid variance of the 
valid predictor. 

Consider a criterion that is relatively “pure,” for example, a criterion 
comprising mostly mechanical ability. Suppose a predictor was “mixed,” 
requiring both verbal and mechanical abilities. The presence of the verbal 
admixture could be thought of as lowering the correlation from what it 
would have been had a predictor measuring only mechanical ability been
available. 

A “pure” verbal test might have a high correlation with the “mixed” 
predictor, but a zero or near-zero correlation with the criterion. 

If the multiple correlation under such circumstances is computed, it is 
found that the addition of the invalid test actually adds to the prediction 
beyond the validity of the “mixed” test. The beta coefficient of the invalid 
measure is necessarily negative. 
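A small numerical illustration may help. The sketch below uses the standard two-predictor formulas for z-score betas. The correlations are hypothetical values chosen only for illustration (a “mixed” test with validity .40, a verbal test with zero validity, and a correlation of .60 between the two tests), and the function name `two_predictor_solution` is likewise hypothetical.

```python
import math


def two_predictor_solution(r01, r02, r12):
    """Betas and multiple R for a criterion (0) and two predictors (1 and 2)."""
    denominator = 1.0 - r12 ** 2
    beta1 = (r01 - r02 * r12) / denominator
    beta2 = (r02 - r01 * r12) / denominator
    multiple_r = math.sqrt(beta1 * r01 + beta2 * r02)
    return beta1, beta2, multiple_r


# Hypothetical values: variable 1 is the "pure" verbal test (zero validity),
# variable 2 is the "mixed" predictor.
beta_verbal, beta_mixed, multiple_r = two_predictor_solution(r01=0.00, r02=0.40, r12=0.60)
print(round(beta_verbal, 3), round(beta_mixed, 3), round(multiple_r, 3))
```

With these assumed values the verbal test receives a negative beta, and the multiple exceeds the validity of the mixed test taken alone.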

Variables that have negative betas in a regression equation are, to some 
degree at least, suppressors. This phenomenon is of interest when a team of 
highly valid predictors is being selected. A variable could have a zero 
covariance with a criterion at a given stage, but still add to the multiple
later on. This would be true if it had a definite degree of relationship with 
one or more other predictors that were still valid. When a variable has 
neither validity nor a definite relationship with a valid predictor, it can add 
nothing to the prediction of the criterion. 


SHRINKAGE OF MULTIPLE R 
A multiple correlation is computed for a particular sample of cases. Each
of the predictors is so weighted that the correlation between the weighted
sum of the predictors and the criterion is as high as possible.

In finding the betas, observed correlations based upon the observed
values of the variables are used. The weighting proceeds as though all 
variability were true variance; that is, the random error in each variable as 
well as the true component is weighted so as to make the correlation with 
the criterion as high as possible. 

In a subsequent sample (or in the population), one would expect the 
random error to be differently disposed. Accordingly, if the beta weights 
found in one sample are applied to the same predictors in another sample, 
it is to be expected that the correlation between the weighted sum of the 
predictors and the criterion will decrease. This phenomenon is generally
observed and is known as the shrinkage of the multiple. 

A formula⁵ for estimating the correlation in the population between the
criterion and the weighted sum of the predictors is

$$R'^2 = 1 - (1 - R^2)\,\frac{N - 1}{N - n'} \qquad (7.9)$$

in which R’ is the shrunken multiple, N is the number of cases, and n’ is
the number of predictor variables selected at any stage.

It will be noted that the shrinkage decreases as N increases, but increases 
when the number of predictor variables becomes large. The formula in- 
dicates what is to be expected when maximizing procedures are used in one 
sample of cases and the weights are then applied in a new sample.
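The behavior described here can be explored numerically. The sketch below is an illustration only and rests on an assumption: it takes the shrinkage formula in a form commonly attributed to Wherry, R’² = 1 − (1 − R²)(N − 1)/(N − n’). Other variants of the correction use a slightly different denominator. The function name `shrunken_r_squared` and the sample sizes are hypothetical.

```python
def shrunken_r_squared(r_squared, n_cases, n_predictors):
    """Shrunken R^2, assuming the form 1 - (1 - R^2)(N - 1)/(N - n')."""
    return 1.0 - (1.0 - r_squared) * (n_cases - 1) / (n_cases - n_predictors)


# Effect of the number of cases, holding the number of predictors at 4.
for n_cases in (50, 100, 400):
    print(n_cases, round(shrunken_r_squared(0.2785, n_cases, 4), 4))

# Effect of the number of predictors, holding the number of cases at 100.
for n_predictors in (2, 4, 8):
    print(n_predictors, round(shrunken_r_squared(0.2785, 100, n_predictors), 4))
```

Under this assumed form, the estimate moves closer to the observed R² as the number of cases grows and falls further below it as predictors are added.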

Many personnel psychologists handle this problem by cross-validation. 
A procedure involving maximization, whether by working out regression 
weights or by selecting highly valid combinations of test items, is not con- 
sidered to yield the true correlation with the criterion. Weights obtained 
are applied to a subsequent sample of cases, and the validity found therein
is taken as a better indication of the true validity.


CONSIDERATIONS IN UNDERSTANDING MULTIPLE R


If the predictors more or less duplicate one another (as evidenced by high 
intercorrelations), the multiple will tend to be a little higher than the 
validity of the most valid predictor. If there are a number of predictors 
and they are more or less independent (as evidenced by intercorrelations 
approximating zero), the multiple will be considerably higher than the 
validity of the best predictor. On the assumption that all validities of n
predictors (indicated as $r_{0i}$) are equal to one another, and that all
intercorrelations (indicated as $r_{ij}$) are also equal to one another, but the validities
and intercorrelations are not necessarily equal, one can find the effect of
different sets of validities on different sets of intercorrelations from the
following formula:

$$R = r_{0i}\sqrt{\frac{n}{1 + (n - 1)\,r_{ij}}} \qquad (7.10)$$


⁵ Developed by Wherry (7).


Under the assumptions stated above, this is an exact formula for multiple
R. For example, for n predictors of equal validity and intercorrelations
of .00, the multiple is $\sqrt{n}$ times the validity of a single predictor. If all
intercorrelations are 1.00, one predictor is as good as the whole team. Consider
also a specific case of 16 predictors, with all intercorrelations .20 and all 
validities .25. Multiple R would be .50. Sometimes, when conditions are 
met reasonably well, Formula 7.10 can be used to estimate the multiple, 
but the more validities or intercorrelations depart from equality, the less 
useful the formula is for this purpose. 
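The cases mentioned in this paragraph can be verified with a few lines of Python. This is a minimal sketch of Formula 7.10, R = r(0i) times the square root of n / [1 + (n − 1) r(ij)]. The function name `equal_correlation_multiple` is hypothetical.

```python
import math


def equal_correlation_multiple(validity, intercorrelation, n):
    """Multiple R when all validities are equal and all intercorrelations are equal."""
    return validity * math.sqrt(n / (1 + (n - 1) * intercorrelation))


case_cited = equal_correlation_multiple(0.25, 0.20, 16)    # the case cited in the text
independent = equal_correlation_multiple(0.25, 0.00, 16)   # intercorrelations of .00
duplicated = equal_correlation_multiple(0.25, 1.00, 16)    # intercorrelations of 1.00
print(round(case_cited, 2), round(independent, 2), round(duplicated, 2))
```

With intercorrelations of .00 the result is the square root of n times the single validity, and with intercorrelations of 1.00 it reduces to the single validity itself.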

A final word—it should be apparent that the multiple can never be less 
than the absolute value of the highest validity. If only one variable is useful 
in predicting the criterion, it alone will be weighted, and the weights of the 
other predictors will be zero. 


SUMMARY 


A coefficient of multiple correlation represents the relationship between a 
single variable on the one hand and a weighted combination of variables 
on the other, the weights being determined in such a manner that the cor- 
relation is a maximum in the sample in which it is computed. 

Such correlations are subject to “shrinkage” in subsequent samples.
Consequently, no computed multiple R should be regarded as descriptive 
of a generally obtaining relationship. When the weights obtained in one 
sample are applied to the predictors in a subsequent sample, the correlation 
between the criterion and the weighted composite is a better indication of 
the true relationship than the original multiple R. 

Beta coefficients are weights applicable to z-scores in finding a weighted 
composite. 

Variances of residual variables (partial variances) and covariances be- 
tween pairs of residual variables (partial covariances) are intermediate 
statistics useful in finding a multiple R and the related betas. 

Intercorrelations remaining the same, multiple R increases as validities 
increase. Validities remaining the same, multiple R increases as intercor- 
relations decrease. In other words, the less predictor variables overlap 
among themselves and the more they overlap the criterion, the better the 
criterion can be forecast.


EXERCISES


1. The following artificial variables are expressed as z-scores:

CASE      $z_2$      $z_1$      $z_0$
A.A.      1.50      1.00      1.50
B.B.      1.00     −1.50       .00
C.C.       .50       .00      1.00
D.D.       .00       .50      −.50
E.E.      −.50      −.50     −1.00
F.F.     −1.00      1.50       .50
G.G.     −1.50     −1.00     −1.50

(a) Compute the matrix of intercorrelations, remembering that $r_{ij} = \sum z_i z_j / N$.

(b) Find $R_{0(12)}$ and related betas.

(c) By means of the regression equation, $\tilde{z}_0 = \beta_{01.2} z_1 + \beta_{02.1} z_2$, find the seven
values of $\tilde{z}_0$.

(d) Find the variance of $\tilde{z}_0$, which should be $R_{0(12)}^2$.

(e) Find the correlation between $z_0$ and $\tilde{z}_0$, which should be $R_{0(12)}$.

2. Let three predictor variables be X, Y, and W. Let the criterion be Z. Using
appropriate subscripts for all variances, covariances, and betas, write
algebraically the successive matrices that can be used to find the partial variance
of the criterion after variance predictable from X, Y, and W has been removed.
In the same notation, write a formula for multiple R, computing formulas
for final-order betas, and the regression equations in z form.

3. Using Formula 7.10, $R = r_{0i}\sqrt{n/[1 + (n - 1)\,r_{ij}]}$, which applies when all
validities ($r_{0i}$) are equal and when all intercorrelations ($r_{ij}$) are also equal,
construct a table of R for cases in which n, the number of predictors, is 4.
If the formula yields a value of R that is imaginary or greater than 1.00, the
corresponding constellation of correlations is impossible, and an
appropriate indication should be entered in the table. The following format is
suggested:

TABLE OF R FOR FOUR PREDICTORS WITH EQUAL INTERCORRELATIONS ($r_{ij}$)

EQUAL
VALIDITIES                        INTERCORRELATIONS
($r_{0i}$)       −.60   −.40   −.20    .00    .20    .40    .60    .80
.80

4. An aptitude test of 50 items has a mean item-item correlation of .30. The
mean of the item validities against an external criterion is .15.

(a) If it is assumed that all item-item correlations are equal to their mean of
.30 and that all item validities are equal to their mean of .15, what is the
validity of the complete instrument?

(b) What would be the estimated validity under the same assumptions if the 
test were reduced to 25 items? 

5. Use the sum variable as a check while finding multiple R for the following
matrix of correlations from Breimeier (1). Also compute and check the
final-order betas.

VARIABLES                                      3      2      1      0
4 ACE psychological examination               .49    .05    .35    .21
3 Strong vocational interest (ministry)              .36    .42    .11
2 Kuder preference (musical)                                [E]    .19
1 Kuder preference (literary)                                      .12
0 Grade point average (theological school)


6. The following hypothetical example is designed to show how a suppressor
variable might function:

                                  1       0
2 Mechanical aptitudeᵃ          .707    .350
1 Verbal aptitude                       .000
0 Mechanical occupation

ᵃ With items written in verbal context.

(a) How much does a nonvalid test (verbal aptitude) add to the prediction of
the valid test (mechanical aptitude)?

(b) Is this situation reasonable?

7. (a) In sample A in the following data from Taylor and Tajen (6), apply the
Wherry-Doolittle method to find a highly valid subset of three predictors.

(b) For these three predictors, find R and related betas.

(c) Apply these betas to sample B and find the correlation between the
three variables so weighted and the criterion. Has there been shrinkage?

(d) For the same three predictors in sample B, compare the correlation so
found with R.

Sample A: N = 96

                              4      3      2      1      0
5 Clerical speed             .48    .61    .22    .46    .44
4 Word meaning                      .65    .21    .56    .48
3 Arithmetic                               .04    .50    .53
2 Figure cancellation                             .14    .06
1 Figure classification                                 [292]
0 Grade in course

Sample B: N = 97

                              4      3      2      1      0
5                            .45    .49    .35    .41    .43
4                                   .55    .32    .48    .38
3                                          .29    .38    .36
2                                                 .30    .24
1                                                        .48


8. Below is the matrix of intercorrelations for the 11 variables in the cross- 
validation sample in the study described in Example 7.3 (5). N = 112. 


        9      8      7      6      5      4      3      2      1      0
10     .25    .32    .27    .48    .14    .36    .25    .32    .29    [2]
9             [6]    .26    .42    .21    .50    .50    .30    .53    .22
8                    .32    .22    [1]    .26    .60    .24    .26    .41
7                           .28    .44    .33    .11    .29    .51   [2.34]
6                                  [3]    .43    .19    .50    .35    [.3]
5                                         .53    .04    .51    .60    .20
4                                                .17    .81    .83    .46
3                                                       .20    .12    .24
2                                                              [3]    .05
1                                                                     .28

By the Wherry-Doolittle procedure illustrated in Example 7.3, find a highly 
valid team of four predictors in this sample and cross-validate the multiple 
in the original sample.


INDEPENDENT AND DEPENDENT VARIABLES


Many psychological studies can be understood as an attempt to ascertain
the relationship between two variables. Generally, one of the two can be 
identified as the “independent” variable; the other, as the “dependent” 
variable. 

In a laboratory experiment, there may be a series of trials (or two or more
groups of individuals) for each of which there is a different degree of an 
independent variable, which is usually under the direct control of the in- 
vestigator. The experimenter may be interested in discovering any sys- 
tematic variation in the dependent variable corresponding to changes he 
makes in the independent variable. 

Thus, in a study of factors influencing the perception of a shape, the 
experimenter may vary area as an independent variable and determine the 
degree to which correctness of identification (the dependent variable) is 
related to variation in the independent variable. 

He may do this for a sample of shapes, using a single observer, or he
may carry out the study with a group of observers, exposing each to a
different variant of the stimulus (that is, the independent variable). More
than likely, however, he would employ a combined approach, using varying
degrees of the independent variable with a number of subjects, and
averaging results in order that the general trend might emerge more clearly and
reliably.

If both variables can be conveniently scaled, results of such a study can 
be presented in the form of a graph. The vertical axis of a pair of coordinates
can be laid off to represent varying degrees of the dependent variable, say 
in percent of correct recognitions, while the horizontal axis may represent 
the varying degrees of the area of the stimulus. 

An alternative procedure is to find a mathematical function to represent 
the relationship. Simple mathematical functions, which seem to be most 
appropriate for psychological data, may be graphed as straight lines or 
smooth curves. When raw data are plotted, irregularities in trend lines are 
generally taken as representing sampling or observational errors. 

In Chapter 6 the fitting of a straight line connecting two variables by the
method of least squares was described. To define a straight line precisely,
relative to a pair of coordinate axes, two constants are needed: the slope
and the intercept. In the z form of the regression equation connecting two
variables, $z_0$ and $z_1$, the slope is $r_{01}$ and the intercept is zero. In the
raw-score form of the regression equation, the slope is the regression coefficient
$(r_{01} s_0)/s_1$, and the intercept is the constant term, $[M_0 - (r_{01} s_0 M_1)/s_1]$. By
somewhat similar principles a parabola or other curve can be established
to represent the relationship between two variables. Procedures for fitting
curves are presented in advanced texts in statistics, such as Lewis (3).
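The two constants of the raw-score regression line can be computed directly from the means, standard deviations, and correlation. The following Python sketch is an illustration only. The function name `raw_score_line` and the data are hypothetical, and the standard deviations are computed with N in the denominator.

```python
import math


def raw_score_line(x1, x0):
    """Slope and intercept for predicting variable 0 from variable 1."""
    n = len(x1)
    m1, m0 = sum(x1) / n, sum(x0) / n
    s1 = math.sqrt(sum((v - m1) ** 2 for v in x1) / n)
    s0 = math.sqrt(sum((v - m0) ** 2 for v in x0) / n)
    r01 = sum((a - m1) * (b - m0) for a, b in zip(x1, x0)) / (n * s1 * s0)
    slope = r01 * s0 / s1                 # regression coefficient
    intercept = m0 - slope * m1           # constant term
    return slope, intercept


# Hypothetical data lying exactly on the line x0 = 2 * x1 + 1.
slope, intercept = raw_score_line([1, 2, 3, 4], [3, 5, 7, 9])
print(round(slope, 4), round(intercept, 4))
```

Because the hypothetical points lie exactly on a line, the computed constants recover that line, which serves as a simple check on the formulas.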

An important fact in all types of experiments involving human beings is 
that in addition to the independent variable, there are almost always various 
other variables that may, directly or indirectly, affect variation within the 
dependent variable. In the study of the perception of shape, such factors, in 
addition to size, include configuration, color, and illumination as well as
variables within the observers, such as familiarity, visual skills, and
intelligence.


CONTROL OF EXTRANEOUS VARIABLES 

Any variable not of direct interest in an investigation, but which may affect
the results, may be tagged as an “extraneous” variable. “Control” refers
to methods of eliminating or at least reducing the influence of such
extraneous variables. There are five methods of experimental control through
modification of the situation or through selection of cases. In addition to
these five experimental methods, there is statistical control, the main topic
of this chapter. The present discussion of experimental control emphasizes
the control of a single variable in an experiment. Normally, however,
attempts are made to control two or more variables simultaneously, and
different control methods may be used in the same study. The five methods
are:

1. Eliminating a variable completely;

2. Eliminating a variable by selecting or modifying cases so that all have a 
uniform degree of the characteristic; 

3. Matching cases and distributing them into two or more groups accord- 
ing to values on a variable; 

4. Balancing cases; and 

5. Randomization in the assignment of cases to groups. 


COMPLETE ELIMINATION AS A CONTROL 


When a variable is controlled through elimination, it means that within 
the experimental situation there is complete uniformity with respect to the 
characteristic in question. 

By eliminating a variable completely, the degree of a characteristic is 
reduced to zero in all instances. Thus, if one were studying the relationship 
of different concentrations of table salt to the speed of recognizing a 
watery solution as salty, one would ordinarily use the purest salt and the 
purest water obtainable, thus eliminating completely the effect of other 
chemicals that would ordinarily be in the salt or in the water. 

Elimination by reducing the variable to zero in all cases is more practical 
in psychological studies when the variable characterizes situations than 
when it describes people. In a series of situations there can be zero degrees 
of light or noise or vibration; and, in experiments on taste and smell, zero 
amounts of specified chemical elements. On the other hand, temperature 
of a solution can be eliminated as a variable only by using a constant 
temperature. In a study in which it is necessary to control variation in 
temperature, a certain degree of heat (say, 18°C) may be specified. 


SELECTION OF CASES OF A UNIFORM DEGREE OF A CHARACTERISTIC 


This is a second method of control, in which a variable is eliminated by 
selecting only instances that have a uniform degree of the characteristic to 
be controlled.

This type of control is feasible with attributes of individuals. Theoretically
it is possible to conduct a study in which all subjects are of the same
chronological age, the same I.Q., or same achievement in reading.
However, there are two difficulties: one practical, the other theoretical.

While in some instances it is relatively easy to find numerous children 
who are, say, within two months of their tenth birthday, it is very difficult 
to find substantial numbers that are within, say, two days of their tenth 
birthday. Instead of uniformity in respect to age or I.Q., or reading
achievement, the best that can be expected in practice is a narrow range of 
the variable. This may reduce its effect considerably, but does not
completely eliminate it.

On the theoretical side, the variables on which it is feasible to control
may not be the variables that need to be controlled. If one controls on sex, 
by using only boys in a study, there still may be a wide range of variation in 
variables related to sex, such as interests and physical characteristics, and 
which may also be related to the experimental variables. 


MATCHING OF CASES 


The third method of control is by matching precisely the cases assigned to 
two or more groups. If there are two groups, corresponding to two degrees 
of the experimental variable, cases that are identical on the variable to be 
controlled are paired. One member of the pair is assigned at random to 
each of the groups. If there are three or more groups, then clusters of three 
or more cases are located, all identical on the variable to be controlled. 

Consider, for example, a study of the effect of different incentives on 
learning. Normally there would be at least one “experimental” group and 
one “control” group. In the experimental group a special incentive might 
be introduced, whereas in the control group there would be no special 
motivation. As an alternative, there might be several groups, each
representing a different degree of the experimental variable.

Obviously, if the groups vary at the start of the study in some way that 
is related to the variables under scrutiny, it will be difficult or impossible 
to draw valid conclusions. 

Suppose there are three groups in the study, two experimental groups 
and a single control group, with the experimental variable in three degrees 
and varying with group membership. To control a variable such as reading 
comprehension, steps might be as follows: 


1. Measurement of the reading comprehension of all individuals who
might be used in the experiment;
2. Arrangement of all individuals in order on this variable;
3. Selection of sets of three cases with more or less identical scores; and
4. Assignment of the three cases in each set to the three groups at random.
This may be accomplished by devices such as dice or by a table of
random numbers.
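The four steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed routine: adjacent cases in the ordered list are treated as a matched set (in practice one would also require that the scores within a set be nearly identical), leftover cases are dropped, a pseudo-random shuffle stands in for dice or a table of random numbers, and the subject labels, scores, and function name are hypothetical.

```python
import random


def assign_matched_sets(scores, n_groups, rng):
    """scores: dict mapping subject -> score on the control variable."""
    ordered = sorted(scores, key=scores.get)           # step 2: arrange in order
    groups = [[] for _ in range(n_groups)]
    # Step 3: take adjacent cases as a matched set; leftover cases are not used.
    for start in range(0, len(ordered) - n_groups + 1, n_groups):
        matched_set = ordered[start:start + n_groups]
        rng.shuffle(matched_set)                       # step 4: random assignment
        for group, subject in zip(groups, matched_set):
            group.append(subject)
    return groups


# Step 1: hypothetical reading comprehension scores.
scores = {"A": 41, "B": 55, "C": 42, "D": 54, "E": 41, "F": 56, "G": 70}
groups = assign_matched_sets(scores, 3, random.Random(0))
for group in groups:
    print(group, [scores[s] for s in group])
```

Each group receives one member of every matched set, so the groups are closely comparable on the control variable before the experimental treatments are applied.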


Matching of subjects is relatively easy on a single control variable, when
assignments are made to a limited number of groups, and when there are
relatively large numbers of subjects from which to choose. Matching
becomes progressively more difficult as more groups are involved, as an
attempt is made to match on two or more variables simultaneously, and 
when the pool of potential subjects is limited. When feasible, matching is an 
excellent method of control in that nonlinear as well as linear correlation 
between any matched variable and the independent variable becomes zero. 


BALANCING OF CASES


When the pool of potential subjects is limited, or when it is desired to use 
more than two experimental groups, or when an attempt is made to control 
on two or more variables simultaneously, balancing of cases is more 
flexible than matching. In balancing, subjects are assigned so that the mean 
of each controlled variable is identical in each of the groups. 

There is no exact formula for balancing. One procedure is to tally cases 
by groups as they are assigned and note whether the means remain more 
or less identical. Of course, before assignments are considered final, iden- 
tity of group means should be demonstrated. 

In addition to balancing on mean values, it is generally desirable in 
making the assignments to balance so that variances also become equal. 
This can be accomplished by noting the dispersions of the distributions 
during the process of assignment. 

In actual experimentation, how much departure from equality of means 
and variance can be tolerated is a matter of judgment. Clearly, no group 
differences in either means or variances should be significant by statistical 
tests discussed in Chapter 13. Obviously, the closer the group means and 
variances are to being identical, the better the job that has been done in 
controlling the variable. 

When variables are controlled by balancing, linear correlation between 
any controlled variable and any experimental variable is zero. However, it 
is conceivable that even with careful balancing, some sort of nonlinear 
relationship may exist between a controlled and experimental variable. 
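
Since there is no exact formula for balancing, the check on a tentative assignment is simply a comparison of group means and variances. A minimal sketch follows; the function names and the three small groups of scores are hypothetical and chosen only to show a case in which the means are balanced but the variances are not.

```python
from statistics import mean, pvariance

def balance_report(groups):
    """Return (mean, variance) of the control variable for each group."""
    return [(mean(g), pvariance(g)) for g in groups]

def largest_gap(values):
    """Largest difference between any two groups on a summary value."""
    return max(values) - min(values)

# Hypothetical scores on the control variable for three tentative groups.
groups = [[40, 50, 60], [45, 50, 55], [42, 50, 58]]
report = balance_report(groups)
mean_gap = largest_gap([m for m, _ in report])       # 0: the means are identical
variance_gap = largest_gap([v for _, v in report])   # large: dispersions still differ
```

In this illustration the assignment would be revised further, since identical means with unequal variances meet only the first of the two conditions described above.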


RANDOMIZATION OF CASES 


A fifth method of control is to assign cases to groups by chance. Various 
systems can be used for this purpose, such as tossing coins or drawing lots. 
Probably the most satisfactory way is through the use of a table of random 
numbers and a predetermined formula. 

Consider 200 subjects to be assigned at random to four groups, that is, 
50 cases to each group. The first step would be to number the cases in any 
convenient way from 001 to 200. The second step would be to establish a 
rule for assignment to groups, such as using the last two digits of every fifth 
number in a table of random numbers, and assigning group membership 
on the basis of the remainder when these two digits are divided by four. A 
remainder of 1 might mean assignment to group 1; a remainder of 2, 
assignment to group 2; a remainder of 3, assignment to group 3; and re- 
mainder of zero, assignment to group 4. Each case would be studied in 
order, and the corresponding random number would be consulted in 
making the disposition of the case. In all probability the four groups would 
not be filled simultaneously. Suppose group 1 were filled first, with 50 cases. 
Thereafter, numbers with a remainder of 1 would be disregarded. The rule 
would then be to go on to the next random number until a second group
was filled, and so on until 50 cases were allotted to each of the four groups. 
The use of a table of random numbers is shown in Example 8.1. 
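
The assignment rule just described can be sketched as follows. This is an illustration rather than the procedure of any particular study: the function name `assign_by_remainder` is hypothetical, and a seeded pseudorandom generator supplies the two-digit numbers that would otherwise be read from every fifth entry of a printed table.

```python
import random

def assign_by_remainder(n_cases=200, n_groups=4, seed=0):
    """Assign cases in order, using the remainder of a two-digit random
    number divided by n_groups; numbers pointing at a filled group are
    disregarded and the next number is consulted."""
    if n_cases % n_groups:
        raise ValueError("cases must divide evenly among the groups")
    rng = random.Random(seed)                 # stand-in for the random number table
    quota = n_cases // n_groups
    groups = {g: [] for g in range(1, n_groups + 1)}
    for case in range(1, n_cases + 1):
        while True:
            two_digits = rng.randrange(100)   # "last two digits" of a table entry
            remainder = two_digits % n_groups
            group = remainder if remainder != 0 else n_groups  # remainder 0 -> last group
            if len(groups[group]) < quota:
                groups[group].append(case)
                break
    return groups

groups = assign_by_remainder()
```

With four groups, the numbers 00 to 99 yield each remainder equally often, so no group is favored before the quotas begin to fill.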


EXAMPLE 8.1


CONSTRUCTION AND USE OF A TABLE OF RANDOM DIGITS 


Purpose. In psychological research, tables of random numbers are useful in
replacing human judgments with a method of decision that lacks the possibility
of systematic bias.

In many cases the mathematical model used in making inferences assumes that
the sample being studied has been drawn at random from an unlimited popula-
tion. The use of a table of random numbers to reduce bias in selecting the sample
makes the mathematical model more appropriate.

In other cases, where a total sample is to be divided into subgroups, a table of
random numbers may be useful. A useful method of controlling a variable in n
subgroups is to:


1. Rank all cases on the variable;
2. Divide the N cases in order of rank into subsets of n each;
3. Use a table of random numbers in deciding which case of each subset is
to be assigned to each of the n subgroups.


Construction of a Table of Random Numbers. Random multidigit numbers are
simply aggregations of random digits. In the decimal system, a universe of
random digits would consist of equal proportions of the ten digits with equal
likelihood of any digit being drawn at any particular time.

Various mechanical devices can be used as sources of random digits, one of
them being described in the volume by the Rand Corporation (5) from which the
2500 digits of Table 8.1 were selected by chance.

The decision as to where to enter a table of random numbers is generally made
on the basis of some sort of a mechanical lottery, such as tossing one or more
coins or using dice.

In the present instance, after developing a general plan that called for selecting
2500 of the 1 million digits of the Rand table, it was decided to insert a pointer
blindly into the table to locate a key line of random numbers (ten blocks of five
digits each). Other prior decisions were:

1. The first digit that was a 1, 2, 3, or 4 encountered in the first block would
determine the series of fourth pages used in developing the sample of digits.
If a 1 was first encountered, then the series of pages would be 1, 5, 9 ...; if 2,
then the series would be 2, 6, 10 ...; and so on.

2. In the Rand tables there are 50 lines on a page. Accordingly, it was decided
that the first digit in the second block would determine the tens digit of the line
to be selected on each fourth page, 0 being paired with 6, 1 with 7, 2 with 8,
and so on.

3. It was decided that the initial digit of the third block would be taken as
the units digit of the designating line.


TABLE 8.1. A TABLE OF 2500 RANDOM DIGITS

[The 2500 digits of the table are not legible in this copy.]

4. Since only 25 digits were to come from each selected line of 50 digits, it was
decided that if the first digit in the fourth block were odd, the odd sequence
1, 3, 5 ... would be used; if it were even, then the even sequence.

Prior to entering the table with a pointer, a toss of a coin was used in deciding 
whether the pages to be inspected would be odd or even. The decision was for 
an even page, and the book entered accordingly. 

Entry into the table resulted in automatic decisions: 


- To use pages with page numbers divisible by four without remainder; 

- That the tens digit of the line number would be 8; 

- That the units digit of the line number would be 2; and 

- That the even sequence of numbers on lines designated as xx82 would 
constitute the table. 




Following a plan determined by chance, the pool of 2500 digits reported as 
Table 8.1 was selected. 

Testing a Table of Random Digits. Tests to determine whether a table of digits 
really is random can be of many varieties, but all would resemble those to be 
described in Chapter 13. 

Testing the Diversity Between Observation and Hypothesis. If a table is random,
conditions such as the following would be true within sampling error:


1. Intercorrelations of rows and intercorrelations of columns would be .00.

2. Distributions of the ten digits for the table as a whole and for larger
subdivisions would be rectangular.

3. Sequences and particular number combinations would occur no more and
no less frequently than expected by chance.

4. The distribution of identical digits (pairs, triplets, and the like) within
successive groups of n digits would be predictable on the basis of
probability theory.
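
The second condition, a rectangular distribution of the ten digits, lends itself to a simple check. The sketch below computes a chi-square statistic comparing observed digit frequencies with the equal frequencies expected by chance. The function name `digit_chi_square` is hypothetical, and because the digits of Table 8.1 are not reproduced here, a seeded pseudorandom sequence of 2500 digits serves as a stand-in.

```python
import random
from collections import Counter

def digit_chi_square(digits):
    """Chi-square for the departure of the ten digit frequencies
    from a rectangular (equal-frequency) distribution."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))

# Stand-in for a table of 2500 digits (not the digits of Table 8.1).
rng = random.Random(1)
digits = [rng.randrange(10) for _ in range(2500)]
chi_sq = digit_chi_square(digits)

# With ten categories there are 9 degrees of freedom; a value above
# about 16.92 would be significant at the .05 level.
```

A perfectly rectangular set of digits yields a chi-square of zero; the interpretation of larger values belongs with the tests of Chapter 13.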


Using a Table of Random Digits. In psychological research one generally uses
published tables such as the Rand table. The use of a table of random digits
was illustrated in the decisions made in constructing Table 8.1. Within the
framework of an investigation, the decisions that are to be free of the bias of
the experimenter are catalogued, and then chance is involved as a basis for
these decisions.
Rules for applying the devices, such as coins or dice or roulette wheels or random
numbers, are established arbitrarily in advance; then the chance devices are
followed in making the decisions. The range of possible applications is vast.


As a method of control, randomization makes for easy decisions on the 
part of the experimenter. It also circumvents any prejudices he may have 
which might affect assignment to groups. Theoretically, the correlation
between any extraneous variable and the experimental variable associated
with group membership is of the magnitude to be expected by chance.

When methods exist for the measurement of variables to be controlled,
randomization is hardly to be recommended: complete elimination, elimination
by the use of a constant value, matching, and balancing are all much
better. However, after one or more variables are controlled in arranging the
cases in sets, then a chance procedure for final assignment to groups has
much to recommend it.


CONTROL THROUGH FORMING RESIDUAL VARIABLES 


As noted in the preceding chapter, any variable measured over N cases (on
a scale for which linear correlation is an appropriate statistic) can be
"residualized" with respect to any other variable or any group of variables
measured over the same N cases. Residualization greatly resembles the
methods of control described above. It results in variables that are linearly
uncorrelated with one or more other variables, and thus eliminates
variance associated with these "outside" variables.

Ordinarily, residual variables are not used in making assignments to 
groups; rather we partial out one or more variables to be controlled from 
the independent variable, the dependent variable, or both. This is usually 
effected by computing correlations in which one or both of the two vari- 
ables entering into the correlation have been modified so that all values are 
residuals with respect to one or more outside or “secondary” variables. 

In generalized notation let $z_{i.jk\ldots}$ refer to a variable in higher-order
z form; that is, with mean of zero, but variance less than unity. Variable i
is a primary, observed variable. In effect, scores in variable i have been
divided into two portions: $\tilde{z}_{ij}$, which is predictable from $z_j$ and perfectly
correlated with $z_j$; and $z_{i.j}$, which is uncorrelated with $z_j$. In a two-variable
scatter diagram, the $\tilde{z}_{ij}$ values are all exactly on the regression line, the
least squares line of best fit, while all the $z_{i.j}$ values are residuals, that is,
distances between the observed values and the regression line.

After the $z_i$ values have been modified to form the $z_{i.j}$ values, the latter
can be "residualized" again with respect to $z_{k.j}$. This results in a new
variable, $z_{i.jk}$, a second-order residual variable, which is uncorrelated with
both variable j and variable k and which comprises the variance remaining
in variable i after the variance predictable from the best weighted team of
variables j and k has been subtracted out. The numerical value of each case
can be expressed as the original value less a portion predicted through the
use of the regular regression equation:

$$z_{i.jk} = z_i - \beta_{ij.k} z_j - \beta_{ik.j} z_k$$

The process can be continued until an observed variable has been made
uncorrelated with respect to any number of secondary variables, as indicated
by the general notation $z_{i.jk\ldots}$.

While the individual values may sometimes be of interest, the correlation
of residuals is of more importance in unraveling complex relationships
among observed variables.
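
A first-order residual variable can be formed directly from two sets of scores, as the following sketch suggests. The helper names and the six pairs of scores are hypothetical. The check at the end illustrates the two properties stated above: the residual has zero covariance with the secondary variable, and its variance is less than unity (specifically, one minus the squared correlation).

```python
from statistics import mean, pstdev

def z_scores(values):
    """Convert raw scores to z form (mean 0, standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def residualize(x, y):
    """Return (r, residuals): the correlation of x with y, and x in z form
    less the portion predictable from y in z form."""
    zx, zy = z_scores(x), z_scores(y)
    r = sum(a * b for a, b in zip(zx, zy)) / len(zx)
    return r, [a - r * b for a, b in zip(zx, zy)]

# Hypothetical scores on a primary variable x and a secondary variable y.
x = [2, 4, 5, 7, 9, 10]
y = [1, 3, 2, 6, 5, 8]
r_xy, z_x_dot_y = residualize(x, y)

zy = z_scores(y)
cov_with_y = sum(e * b for e, b in zip(z_x_dot_y, zy)) / len(zy)
residual_variance = sum(e * e for e in z_x_dot_y) / len(z_x_dot_y)
```

Repeating the step on the residuals of two variables, each taken with respect to the same secondary variable, leads to the second-order residual described in the text.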


RATIOS AS RESIDUAL VARIABLES


Actually, a residual variable falls into the same statistical domain as a 
ratio or quotient. If a ratio, such as an intelligence quotient, has been pro- 
perly constructed, it will correlate zero with the denominator variable. 
This is necessarily true in the original group in which the system of quotients 
has been developed, provided the relationship between the numerator and 
denominator variables is linear, and provided the standard deviation of 
the numerator variable is greater than the standard deviation of the de-
nominator variable by a factor of 1/r. The intent of the intelligence quotient,
in which mental age is the numerator variable and chronological age is the
denominator variable, is to obtain a measure of intelligence that has been
purified by subtracting from the original measure that portion which is
predictable from a knowledge of chronological age alone. If an intelligence
quotient is constructed so that it correlates zero with chronological age,
it will tend to be constant as the child grows older, which is one of its
desirable characteristics. Although numerically quite different, a set of
I.Q.'s for a standardization group, in which the regression of chronological
age on mental age is linear, will correlate more or less perfectly with a set
of residuals formed by subtracting from the mental age the portion pre-
dictable from chronological age.

In order to construct a set of ratios that will correlate zero with the de-
nominator variable (designated as Y), both variables can be converted to
standard scores with identical and positive means (such as 50 or 100)
and with the standard deviation of the numerator variable $X'$ equal to
$s_y / r_{xy}$. Such a set of ratios will correlate almost perfectly with the residual,
$z_{x.y} = z_x - r z_y$. The equivalence of ratios and residuals is demonstrated
in Example 8.2.


EXAMPLE 8.2 
EQUIVALENCE OF RATIOS AND RESIDUALS 


Two variables (N = 5) in z form, $z_x$ and $z_y$, are shown below. $X'$ is a linear
conversion of $z_x$ and $Y$ is a linear conversion of $z_y$. $X'/Y$ is the ratio formed
from these converted scores, multiplied by 100 and rounded to the nearest integer.
Also shown are $\tilde{z}_x$, that is, $z_x$ as predicted from $z_y$, and the residual
variable $z_{x.y} = z_x - \tilde{z}_x$.


Case     z_x     z_y     X'     Y     X'/Y    pred. z_x    z_x.y
A.A     1.50     .50     65    53     123        .30        1.20
B.B      .50    1.50     55    59      93        .90        -.40
C.C      .00     .00     50    50     100        .00         .00
D.D     -.50   -1.50     45    41     110       -.90         .40
E.E    -1.50    -.50     35    47      74       -.30       -1.20

By computing $\Sigma z_x z_y / N$, it can be seen that $r_{xy} = .60$. $X'$ and $Y$ have been
formed with identical means. However, a standard deviation of 6 has been
assigned to $Y$, and of 10 ($s_y / r_{xy} = 6/.60 = 10$) to $X'$. The result is that $X'/Y$
correlates approximately .00 with $Y$. (The actual correlation is -.012.)

The correlation of $X'/Y$ and the residual $z_{x.y}$ is .997, so that as a variable,
$X'/Y$ has statistical properties almost identical with those of $z_{x.y}$.

In practical instances, the residual is probably to be preferred to a ratio, since 
it correlates precisely .00 with the eliminated variable. It is also a simple matter 
to transform a set of residuals to standard scores with any desired M’ and s’. 
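
The figures of Example 8.2 can be checked with a short computation. In this sketch the helper `corr` and the variable names are illustrative. The fifth case is entered as $z_x = -1.50$, $z_y = -.50$, the values required for both z variables to have a mean of zero; that row should be treated as an inference. A common mean of 50 for the converted scores is taken from the tabled values.

```python
from statistics import mean, pstdev

def corr(x, y):
    """Pearson product-moment correlation (N in the denominator)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

z_x = [1.5, 0.5, 0.0, -0.5, -1.5]
z_y = [0.5, 1.5, 0.0, -1.5, -0.5]
r_xy = corr(z_x, z_y)                      # .60

sd_y = 6
sd_x = sd_y / r_xy                         # 6 / .60 = 10
X = [50 + sd_x * z for z in z_x]           # numerator variable X'
Y = [50 + sd_y * z for z in z_y]           # denominator variable Y
ratio = [round(100 * a / b) for a, b in zip(X, Y)]
residual = [a - r_xy * b for a, b in zip(z_x, z_y)]   # z_x.y

r_ratio_y = corr(ratio, Y)                 # close to zero
r_ratio_residual = corr(ratio, residual)   # close to unity
r_residual_y = corr(residual, z_y)         # zero apart from rounding error
```

The last value shows the advantage claimed for the residual: its correlation with the eliminated variable is zero by construction, whereas that of the ratio is only approximately zero.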


PARTIAL CORRELATION 


As a descriptive statistic, a partial correlation is the Pearson product- 
moment correlation between two residual variables, both of which have 
been “residualized” with respect to an identical set of secondary variables. 
A partial correlation may be used to estimate what the correlation between 
the two primary variables would be in groups homogeneous with respect 
to the secondary variables. Suppose, for example, there is a series of groups, 
none of which shows variability in a third variable, but which collectively 
exhibit the entire range of the third variable. It would be expected that the 
partial correlation between the two primary variables, with the third par- 
tialed out (that is, the correlation between the two primary variables after 
variance predictable from the third variable has been removed), would be 
representative of the series of correlations between the two primary 
variables in the several groups homogeneous with respect to the third 
variable. The partial correlation could then be considered as a measure of
the relationship between the two primary variables freed of the influence
of the secondary variable.

As an example, consider the relationship between a reading comprehen-
sion test and an intelligence test in grade school children. Scores on both
tests increase with age. For a group of pupils of varying age, the correla-
tion between reading and intelligence will be spuriously high because of
the extraneous variable (namely, age) that is related to both. If the correla-
tion is obtained between reading and intelligence scores (modified so that
as residual variables they are uncorrelated with age), the partial r so ob-
tained can be taken as an estimate of the correlation between obtained
reading and intelligence scores in groups homogeneous with respect to
age.

The experimental approach to this problem parallels the statistical. In
several groups of pupils, each of which is homogeneous as to age, the
typical correlation should be substantially the same as that
obtained by partial correlation over all groups. A comparison of partial
correlation and zero-order correlation within groups is presented in
Example 8.3.


EXAMPLE 8.3


COMPARISON OF PARTIAL r WITH r WITHIN GROUPS 


Purpose of Partial r. A partial r may be considered an estimate of the cor-
relation between two variables in groups homogeneous with respect to the
variable(s) "partialed out." As a descriptive statistic, it is the Pearson product-
moment r between two sets of residuals, both of which, by the fact of being
residual variables, are uncorrelated with the variable(s) partialed out.

r Within Groups. In Table 8.2 are displayed nine scatter diagrams for nine 
groups homogeneous with respect to the pilot aptitude score (pilot stanine) for 
3000 Air Force cadets (1). The correlations, in descending order, between bom- 
bardier and navigator stanine are shown in the table. 


STANINE
GROUP        N       r_BN

  7         359      .688
  8         218      .641
  2         204      .636
  9         224      .632
  3         295      .627 (median)
  5         536      .614
  6         511      .614
  4         571      .572
  1          82      .546

TOTAL      3000


The r for the Total Group. In Table 8.3 are the three scatter diagrams for
the intercorrelations of three variables: bombardier stanine B, navigator stanine
N, and pilot stanine P. The first diagram shows the correlation between the same
variables, B and N, as reported in Table 8.2, but consolidated as one group of
3000, heterogeneous with respect to the pilot stanine. Here the r is .716, which
may be compared with a median of .627 for the nine groups separately. The
other correlations for the total group of 3000 are $r_{BP} = .737$ and $r_{NP} = .473$.

Computation of Partial r. For the special instance of a first-order partial (with
the two primary variables residualized with respect to a single control variable),
Formula 8.1a may be translated into zero-order variances, covariances, and
betas, and then into zero-order r's as follows:

$$r_{ij.q} = \frac{C_{ij.q}}{\sqrt{V_{i.q}}\,\sqrt{V_{j.q}}} = \frac{C_{ij} - \beta_{iq} C_{jq}}{\sqrt{V_i - \beta_{iq} C_{iq}}\,\sqrt{V_j - \beta_{jq} C_{jq}}} = \frac{r_{ij} - r_{iq} r_{jq}}{\sqrt{1 - r_{iq}^2}\,\sqrt{1 - r_{jq}^2}} \qquad (8.1)$$

Substituting the three correlations in Formula 8.1 yields

$$r_{BN.P} = \frac{r_{BN} - r_{BP}\, r_{NP}}{\sqrt{1 - r_{BP}^2}\,\sqrt{1 - r_{NP}^2}} = \frac{.716 - .737 \times .473}{\sqrt{1 - (.737)^2}\,\sqrt{1 - (.473)^2}} = .617$$

which is approximately the same as the median $r_{BN}$ for groups homogeneous
with respect to variable P.
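
The substitution in Formula 8.1 is easily verified. The function name `partial_r` below is illustrative; the three correlations are those reported for the total group of 3000.

```python
from math import sqrt

def partial_r(r_ij, r_iq, r_jq):
    """First-order partial correlation of i and j with q partialed out,
    computed from the three zero-order correlations (Formula 8.1)."""
    return (r_ij - r_iq * r_jq) / (sqrt(1 - r_iq ** 2) * sqrt(1 - r_jq ** 2))

# Bombardier (B) and navigator (N) stanines, pilot stanine (P) partialed out.
r_bn_p = partial_r(r_ij=0.716, r_iq=0.737, r_jq=0.473)
```

Rounded to three places, the result agrees with the value of .617 given in the example.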


TABLE 8.3. CORRELATIONS OF THREE VARIABLES (BOMBARDIER B,
NAVIGATOR N, AND PILOT P STANINES) FOR TOTAL GROUP. N = 3,000


N 
B 1 2 3 4 5 6 7 8 9 fa 
B 1 1 1 10 18 25 57 113 
8 3 5 28 56 62 42 196 
7 1 5 15 59 Ii 93 65 18 367 
6 7 19 54 106 143 105 33 7 474 
5 2 16 6 16 200 156 77 17 3 653 
4 13 38 109 134 129 83 28 6 1 541 
3 21 49 101 97 60 36 6 370 
2 38 38 44 27 18 5 2 172 
1 45 24 29 10 4 2 114 
fy | 121 172 373 457 582 574 385 208 128 3000 

$r_{BN} = .716$

B 
Р 1 2 3 4 5 6 7 8 9 fp 
9 1 9 30 35 53 39 57 224 
8 3 14 23 37 59 47 35 218 
7 5 5 21 57 88 99 69 15 359 
6 5 14 58 155 141 109 25 4 511 
5 1 13 53 132 180 103 39 13 2 536 
4 8 27 121 190 157 58 7 3 571 
3 14 39 102 83 45 и 1 295 
2 46 57 63 3 6 1 204 
1 45 26 8 3 82 
fe 14 172 370 541 653 474 367 196 113 3000 

$r_{BP} = .737$

P 
N 1 2 3 4 5 6 7 8 9 fx 
9 [ 1 з n 10 32 34 36 128 
8 2 5 20 23 44 48 29 37 208 
7 2 6 23 54 68 63 B 44 47 385 
6 8 22 37 16 102 127 73 56 43 574 
5 5 3 71 108 114 118 68 32 35 582 
4 14 34 52 15 92 83 39 10 18 457 
3 16 55 58 95 80 40 16 8 5 373 
2 15 18 23 49 38 19 5 3 2 172 
1 22 235 25 21 8 7 2 1 121 
fe 82 204 295 571 536 511 359 218 224 3000 

It is too much to expect that all investigations would show such a close cor-
respondence between the partial r and r's in homogeneous groups, but in this
case the statistical method of control gives a result practically identical with
that found by experimental control.


HIGHER-ORDER PARTIAL CORRELATION 

A variable can be residualized with respect to any number of secondary
variables. The residual variable so obtained is uncorrelated with each of
the secondary variables and may be considered the part of the original
variable that is independent of all the secondary variables.

When the correlation is found between two variables, both of which have
been made into residuals with respect to an identical set of two or more
secondary variables, the result is a partial correlation of the order indicated
by the number of secondary variables. The interpretation is the same as
for a first-order partial; that is, it can be taken as an estimate of what the
correlation would be in groups homogeneous with respect to the secondary
variables.

While multiple correlation is useful in making predictions for practical
purposes, the value of partial correlation is as a tool in control and in
understanding relationships obscured by the presence of correlated vari-
ables.


COMPUTATION OF PARTIAL r

Partial r of any order may be found rather simply by the same matrix
routine as described in the preceding chapter for finding multiple R. In
this routine, a matrix of variances and covariances of a given order is
residualized with respect to one of its variables so as to form a matrix of
variances and covariances of the next higher order. For any higher-order
variance, the corresponding standard deviation is found merely by extract-
ing the square root. To find any partial correlation, the covariance of two
residual variables is divided by their standard deviations. Let i and j be
any two variables that have been residualized with respect to any number of
secondary variables, collectively represented as q. Then the basic formula
for any partial r is

$$r_{ij.q} = \frac{C_{ij.q}}{\sqrt{V_{i.q}}\,\sqrt{V_{j.q}}} \qquad (8.1a)$$

in which $C_{ij.q}$ is the partial covariance between the residual variables in
z form, $z_{i.q}$ and $z_{j.q}$, and $V_{i.q}$ and $V_{j.q}$ are the two variances. In the matrix
method, which is exhibited in Example 8.4, the three coefficients required
for a partial r of a given order are all found in the same matrix, the matrix
developed by the elimination of the secondary variables specified for the
given partial.
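
The matrix routine can be sketched as follows, assuming a full symmetric matrix of variances and covariances in z form with the secondary variables listed first. The function names are hypothetical. Since Table 8.4 is not reproduced here, the illustration uses the three stanine correlations of Example 8.3, with the pilot stanine P placed first so that it is the variable eliminated.

```python
from math import sqrt

def eliminate_first(matrix):
    """Residualize a variance-covariance matrix with respect to its first
    variable: each new element is the old element less beta times the
    covariance in the top row (E' = E - BC)."""
    top = matrix[0]
    betas = [c / top[0] for c in top]        # top-row covariances / top-row variance
    size = len(matrix)
    return [[matrix[i][j] - betas[i] * top[j] for j in range(1, size)]
            for i in range(1, size)]

def partial_r_matrix(matrix, n_secondary):
    """Eliminate the first n_secondary variables, then divide each partial
    covariance by the two corresponding partial standard deviations."""
    reduced = matrix
    for _ in range(n_secondary):
        reduced = eliminate_first(reduced)
    sds = [sqrt(reduced[k][k]) for k in range(len(reduced))]
    return [[reduced[i][j] / (sds[i] * sds[j]) for j in range(len(reduced))]
            for i in range(len(reduced))]

# Order of variables: P (to be partialed out), B, N.
r = [[1.000, 0.737, 0.473],
     [0.737, 1.000, 0.716],
     [0.473, 0.716, 1.000]]
partials = partial_r_matrix(r, n_secondary=1)
r_bn_p = partials[0][1]
```

For a partial of higher order, the same call is made with a larger `n_secondary`; no lower-order partial r's are computed along the way, only lower-order variances and covariances.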


EXAMPLE 8.4


COMPUTATION OF PARTIAL AND PART r 


Strategy of Partial r. Except for the final steps, the matrix method of finding 
higher-order partial r is computationally the same as finding a multiple R. The 
steps are: 

1. Write the matrix with r’s (that is, covariances in z form) above the diagonal
and the 1.00’s (that is, variances in z form) in the diagonal. Variables to be par- 
tialed out are in the first rows and columns; variables for which residual r’s 
are to be found are in the final rows and columns. As shown in Table 8.4, a 
sum variable may be developed to check on the accuracy of the arithmetic. 

2. In each matrix the covariances in the top row are divided by the variance 
in that row to form a vector of betas, which, for convenience, are placed beneath 
the variance. 

3. A new matrix is formed with one less row and one less column by applying
the formula E' = E - BC; that is, any element in the new matrix is the cor-
responding element in the original matrix less the product of the beta (in the
same row but the first column) and the covariance (in the same column but the
top row).

4. The process continues until the matrix of the desired order has been found,
a matrix in which the variability associated with the secondary variables (that is,
those to be "held constant") has been partialed out.

5. Each covariance is then divided by the square roots of the corresponding 
variances, one of which is in the same column and the other of which is in the 
same row. This yields the desired partial r. 

In order to find a matrix of higher-order partial r’s, corresponding lower-order 
partial r’s need not be found. Only lower-order variances and covariances are 
needed. 

When only a single partial r is needed, the initial matrix will consist of the 
secondary variables and the two primary variables. The partial r will be found 
from a 2 by 2 matrix consisting of a single partial covariance and two partial 
variances. 

In Table 8.4 the primary variables are two scores on a learning test, a pretest 
in a course in Navy technical training and a final achievement measure in the 
same course. The secondary variables are three aptitude tests. It is to be noted 
that the process of partialing out the three aptitude tests influences the correla- 
tion between the two learning tests very little, but does reduce to a considerable 
degree other correlations among the primary variables. The third-order partial r’s 
may be taken as estimates of the intercorrelations of the primary variables in 
groups homogeneous as to the three aptitude tests. 

Computation of Part r. Table 8.5 shows the computation of first-order part 
r’s, using data abstracted from Table 8.4. A matrix format is used in which variable 
6 is eliminated from the covariances between variable 7 and the three aptitude 
tests. The partial variance of variable 7, after variability associated with variable 
6 has been partialed out, is .8319. The square root of .8319, or .9121, is divided 
into the partial covariances in the column headed 7.6, to find the first-order 
part r’s in the last column. The procedure exploits the fact that in reduced z form,
a part covariance and a partial covariance are numerically identical. 


TABLE 8.5. COMPUTATION OF FIRST-ORDER PART r's
(Data from Table 8.4; N = 213)


VARIABLE                               6      1     2     3      7      7.6
6. Achievement pretest             6  1.00   .26   .21   .38    .41
1. General classification test     1   .26    xx    xx    xx    .33    .2234
2. Arithmetic test                 2   .21          xx    xx    .32    .2339
3. Mechanical test                 3   .38                xx    .47    .3142
7. Final achievement               7   .41                     1.00    .8319


The three partial covariances and the partial variance in the column headed 7.6 are formed from the
entries in the preceding column, as variable 6 is eliminated. Thus


$C_{17.6} = .33 - (.26 \times .41) = .2234$        $r_{1(7.6)} = .24$

$C_{27.6} = .32 - (.21 \times .41) = .2339$        $r_{2(7.6)} = .26$

$C_{37.6} = .47 - (.38 \times .41) = .3142$        $r_{3(7.6)} = .34$

Division by $\sqrt{V_{7.6}}$, or $\sqrt{.8319}$, yields the part r as indicated.


The part r's may be interpreted to mean that the aptitude tests are related to 
gain in the course. 

By substituting the part r’s for the zero-order r’s in column 7, and disregarding
the original entries involving variable 6, it would be possible to find the multiple
correlation between the three aptitude tests and the residual criterion, $z_{7.6}$.
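
The arithmetic of Table 8.5 can be retraced in a few lines. The dictionary names below are illustrative; the correlations are those shown in the table.

```python
from math import sqrt

# Zero-order correlations from Table 8.5.
r_with_pretest = {"general classification": 0.26, "arithmetic": 0.21, "mechanical": 0.38}
r_with_final = {"general classification": 0.33, "arithmetic": 0.32, "mechanical": 0.47}
r_pretest_final = 0.41

# Partial variance of final achievement after the pretest is eliminated.
v_7_6 = 1 - r_pretest_final ** 2

# Partial covariance of each aptitude test with the residual criterion,
# divided by the single partial standard deviation, gives the part r.
part_r = {
    test: (r_with_final[test] - r_with_pretest[test] * r_pretest_final) / sqrt(v_7_6)
    for test in r_with_final
}
```

Only one standard deviation appears in the divisor because each aptitude test, being unmodified and in z form, has a standard deviation of 1.00.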


While partial r’s are often lower than the corresponding zero-order r's, 
there is no necessity for the correlation to be decreased through partialing 
out one or more secondary variables. In some cases, it will increase. The 
limits of partial r, as are those of zero-order r, are —1.00 and +1.00. 


PART CORRELATION 


When one of the variables entering into a correlation is a residual and the 
other is not, the result is a part or semipartial correlation. In a part cor- 
relation, one of the variables has been modified by subtracting the portion 
or portions of each value correlated with one or more secondary variables, 
while the other is unmodified (except perhaps by a simple linear transformation
that has no effect on a correlation). Thus $z_{0.12\ldots n}$ can be correlated
with $z_p$, and the result may be indicated as $r_{p(0.12\ldots n)}$.

In $r_{p(0.12\ldots n)}$, any number of variables have been controlled insofar as
they are related to variable 0, but their relationships with variable p are 
unchanged. The situation is somewhat analogous to control by matching 
or equating, in which the correlation between the independent variable 
(which varies with group membership) and any controlled variable is 
zero, but in which no restrictions are placed on the correlations of the con- 
trolled variables with the dependent variable. 

To find a part r, as in finding any correlation, the covariance between the
two variables concerned is divided by their standard deviations. It is
especially helpful in working with part r to use matrices of variances and
covariances of z scores and higher-order z scores. The reason is that, in
reduced z form, a part covariance (between an unmodified variable
and a residual) happens to have the same numerical value as a partial
covariance involving two sets of residuals (each with the same secondary
variables). Also, in z form, the standard deviation of the unmodified
variable is 1.00. Accordingly, to find a part r, the appropriate partial
covariance is divided by a single partial standard deviation, the standard
deviation of the residual variable. Thus

$$r_{p(0.12\ldots n)} = \frac{C_{0p.12\ldots n}}{\sqrt{V_{0.12\ldots n}}} \qquad \text{or} \qquad r_{p(0.A)} = \frac{C_{0p.A}}{\sqrt{V_{0.A}}} \tag{8.2}$$


The computation of part r from the general matrix solution was illustrated
in Example 8.4.
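
The equivalence on which this shortcut rests can be checked numerically. The sketch below uses simulated scores (the data, the seed, and the helper names `zscores` and `pearson` are all assumptions made for illustration). It compares the part r obtained by actually forming the residual with the value obtained from the correlations alone, for the first-order case with one secondary variable.

```python
import math
import random


def zscores(values):
    """Convert raw scores to z scores (standard deviation computed with N)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    """Product-moment r as the mean product of z scores."""
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


random.seed(0)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]                    # secondary variable 1
x0 = [0.6 * a + random.gauss(0, 1) for a in x1]                # variable 0, to be residualized
xp = [0.5 * a + 0.4 * b + random.gauss(0, 1) for a, b in zip(x0, x1)]  # unmodified variable p

z0, z1 = zscores(x0), zscores(x1)
r01, r0p, r1p = pearson(x0, x1), pearson(x0, xp), pearson(x1, xp)

# Direct route: form the residual z_0.1 and correlate it with variable p.
residual = [a - r01 * b for a, b in zip(z0, z1)]
direct = pearson(xp, residual)

# Matrix route: partial covariance over the single partial standard deviation.
from_matrix = (r0p - r01 * r1p) / math.sqrt(1 - r01 ** 2)

print(round(direct, 6), round(from_matrix, 6))
```

The two routes agree to within rounding error, which is why the residual scores themselves never need to be computed.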


APPLICATIONS OF PART CORRELATION 

An application of part correlation is in the study of the correlates of
change. In an experiment on learning, for example, gain can be defined as
that portion of the proficiency measured at a second point in time, $z_2$,
that is independent of the proficiency measured at an earlier point in time, $z_1$.
Defining gain as a residual, $z_{2.1}$, gain is that portion of $z_2$ that is
uncorrelated with $z_1$. When $z_{2.1}$ is correlated with another variable, such as
$z_3$, the result is a part correlation, $r_{3(2.1)}$, which gives the relationship
between variable 3 and a score representing the change that occurs in
variable 2 between the time that it is measured first and the time it is
measured again.

Part correlation can be applied when there is decline in capacity, as in
old age. It is also a way of handling measures of growth. Whenever we
correlate a ratio (such as I.Q.) with an outside variable, we are, in effect,
finding a part correlation. It will be actually a part correlation if the
study is made in the group on which the I.Q. has been standardized and if
the I.Q. has been constructed so that it correlates zero with chronological
age. The residual variable entering into a part correlation may be of any
order; that is, there may be any number of secondary variables, the number
of which is the order of the part r.


MULTIPLE CORRELATION OF RESIDUAL VARIABLES


Residual variables can also be used in multiple correlations. In multiple
partial r, a number of variables are residualized with respect to an identical
set of secondary variables. One residual variable is then used as the
criterion; the others, as predictors.

Somewhat similarly, multiple part r is possible, as when a residual is 
used as a criterion and a number of unmodified variables are the predictors; 
or when an unmodified variable is the criterion and a number of residual 
variables are the predictors. Other special varieties of correlation can be 
developed, if needed by the logic of the particular situation. In any applica- 
tion of the correlation of residuals, the particular coefficient needed is the 
one required by the logic of the situation. If we wish to deal with observed 
variables (as in most prediction problems in applied psychology), no 
residualization is needed. If we wish to infer what correlation might be 
expected if one or more outside variables were “controlled” or “held con- 
stant” (that is, if the correlation were computed in groups homogeneous 
with respect to the outside variables), then it may be of interest to residua- 
lize one of the variables, or both, entering into the correlation. 


PRACTICAL APPLICATIONS OF RESIDUAL OR REGRESSED SCORES 


Residual variables are seldom, if ever, used in regression equations. 
However, a residual or “regressed score," often in the form of a quotient, 
is sometimes of interest in estimating a psychological characteristic, when 
allowance is made for one or more correlated conditions. 

Accomplishment quotients (educational age divided by mental age), 
educational quotients (educational age divided by chronological age), and 
other ratios have been advocated from time to time, but a persistent draw- 
back has been their characteristic negative correlation with the denomina- 
tor variable. This can be avoided if the derived scores are constructed so as 
to correlate perfectly with a set of scores representing the numerator 
variable residualized with respect to the denominator variable. One good 
method would be to find the residuals and then convert them into a series 
of standard scores with assigned mean and standard deviation. 


COMPARISON OF MULTIPLE AND PARTIAL CORRELATION 


Study of Examples 7.1 and 8.4 shows that the preliminary computational
steps in multiple and partial correlation are exactly the same. A matrix of
variances and covariances in z form is residualized with respect to one of
the variables constituting the matrix. The process may be continued
through successive matrices until the resultant partial variances and
covariances are of the order of the desired partial r. At that time, the partial
correlation is found by dividing a partial covariance by the partial standard
deviations of the two variables concerned. If a part r is desired, the partial
covariance is divided by only a single partial standard deviation, that of the
residual variable.

For every partial variance, there is a corresponding r² if the partial
variance is of the first order, or a corresponding R² if the partial variance
is of the second order or higher. To find the r² or R², the partial variance
(which in the matrix method is in reduced z form) is subtracted from 1.00.
Extracting the square root of the result yields r or multiple R. There is no
point in finding r this way because r was required in order to find the
partial variance. The situation is different, however, with R, which can be
found only in some indirect fashion.

In general, when it is necessary to predict an unobserved criterion on the
basis of an observed set of predictor variables, the multiple correlation
(with associated regression weights) is used as worked out on an earlier
sample of cases. When the aims of the investigation are scientific, that is,
understanding relationships among the variables, some form of the
correlation of residuals by which one or more variables are controlled
statistically is often appropriate.


SUMMARY 

In psychological investigations that attempt to ascertain the relationship
between variables, the presence of sources of additional variation, either
in the situation or within individuals, may obscure the results. Accordingly,
methods have been developed which aim to reduce the influence of
variables that are not of direct interest.

Experimental methods of control include complete elimination of a
variable, use only of cases with a uniform degree of the variable, the sorting
of matched cases into groups, the balancing of the means and standard
deviations in two or more groups, and randomization. These experimental
methods of control aim to reduce to .00 the correlation between an
independent variable and a controlled variable.

Statistical control has a similar objective, but in effect it involves the
modification of the values of the independent variable or the dependent
variable, or both, so that the values become uncorrelated linearly with the
variable or variables controlled or partialed out.

In partial r, both primary variables are modified so as to become
uncorrelated with each of the variables controlled or partialed out. A partial
r can be taken as an estimate of the correlation that would be found between
the primary variables in groups in which there was no variation in the
secondary or controlled variables.

In part r (sometimes called semipartial r), one of the variables is
unmodified; the other is the same type of residual variable that enters into a
partial r. In studies of change, as in learning, part r appears to be a more
appropriate technique than partial r. In all cases the logic of the situation
determines the appropriate type of coefficient to use.


EXERCISES


1. If the digits in Table 8.1 are truly random, then the expected correlation
between any two columns is .00. However, it is extremely unlikely that any r
would be precisely .00; rather, a large number of r's would vary around a mean
of .00, with a standard deviation of $1/\sqrt{N - 1}$.

Compute the correlation between the digits in one of the first five columns 
and the digits in another of the first five columns, and decide whether the 
obtained r constitutes evidence against the hypothesis of random arrangement. 


2. In a learning study, the correct response varies among five choices: A, B, C, 
D, and E. Since the choices vary somewhat in difficulty, the decision has been 
made not to use A, B, C, D, and E completely at random, but rather to arrange 
them in cycles of 10, with each response being correct twice in each cycle. 

Devise a plan, involving a table of random numbers, so that the sequence 
of correct choices in each cycle of 10 will be developed at hazard. Apply the 
procedure in writing out a sequence of 10 cycles. 


3. Sixty subjects have been arranged and numbered in order from 01 to 60 on 
a variable to be controlled experimentally. Devise a plan by which one mem- 
ber of each set of three subjects is to be assigned to group A, the second to 
group B, and the third to group C; then, with the help of a table of random 
numbers, make the assignments. 


4. Consider the following variables, $z_x$ and $z_y$, and corresponding standard
scores X and Y (with M' = 30 and s' = 6).


CASE      z_x      z_y      X     Y
A.A.     1.50      .50     39    33
B.B.      .50     1.50     33    39
C.C.      .00     -.50     30    27
D.D.     -.50      .00     27    30
E.E.    -1.50    -1.50     21    21


(a) Compute values of the ratio X/Y, and correlate this ratio with Y. Why is
this correlation negative?

(b) Compute values of the modified ratio X'/Y (in which X' has a mean of
30 and standard deviation of $s_y/r_{xy}$), and correlate this ratio with Y. Why
is this correlation approximately .00?

(c) Correlate the modified ratio X'/Y with the residual variable $z_{x.y}$, or
$(z_x - r_{xy}z_y)$. Why is this correlation approximately 1.00?


5. By use of methods to be discussed in Chapter 16, it can be proved that if, for
any three variables i, j, and k, $(1 - r_{ij}^2)(1 - r_{ik}^2) - (r_{jk} - r_{ij}r_{ik})^2 \ge 0$, then
first-order partial r's involving the three variables are possible. On the other
hand, if the expression is negative, such partial r's are impossible.

Develop a set of three correlations that meets the criterion, and another
that does not. From the first set, compute a partial r and note that it falls
within the limits of r, -1.00 and +1.00. Also, by computation, note that an
attempt to find a partial r from the second set does not yield a coefficient
within the limits for r.


6. By the partial correlation technique determine whether height appears to be
related to success in basketball in groups homogeneous as to weight, age, and
previous basketball experience. Data are from Jones (2). (Since N = 42,
results should be considered very tentative.)


1 Weight к 
2 Age 52. 09 .5 

| 3 Basketball experience .00 .58 
4 Height 26 
5 Success in basketball 


7. One application of part correlation is to explore the unique contribution of a
variable in predicting a criterion. From the following matrix, reported by
Stern and Gordon (7), determine the proportion of the variance of the criterion
predicted uniquely by variable 1; that is, find $r_{0(1.2345)}$.


4 3 2 1 0 


58 


59 Al 28 54. 


| 5 General classification test — . 
4 Arithmetic test 25 40 59 .38 
3 Mechanical test 12  .36 36 
2 Clerical test 36 8 
1 Oral directions test 36 
0 Recruit Achievement test 


8. Factor analysis (treated briefly in Chapter 17) may be conceived as partialing
out one or more hypothetical variables until the original variables,
residualized with respect to the hypothetical variables (or factors), have
intercorrelations of .00 within sampling error.

For the following matrix, developed by Bonser in 1910 and quoted by
Spearman (6), the correlations with g, the general factor, are given in the top
row and first column. Partial-out g from the matrix and decide whether the
resultant partial r's might differ from .00 only as much as might be expected
by chance. (Precise tests for the significance of correlations are given in
Chapter 13.)


М = 157 
VARIABLE 4 1 2 3 4 5 


g 
| 1 Mathematical j 100 ооо 485 400 397 

$ io ical judgment 4 3 Б > -295 
| 3 oes association 1.000 37 397 24 
| Р ау interpretation 1000  .335 .275 
| H elective judgment 1.000 m 


Spelling 
SPECIAL MEASURES
OF RELATIONSHIP 


9 


ADDITIVE PROPERTIES OF VARIANCES AND COVARIANCES 


It can be readily shown that when, for each individual or case, the scores
of two or more variables are added to form a composite or sum variable,
the variance of the new variable is a function of the variances and
covariances of the constituent variables. If T is the sum or total variable and
$X_1, X_2, \ldots, X_n$ are the constituent variables, then

$$T = X_1 + X_2 + \cdots + X_n \tag{9.1}$$

Summing all terms and dividing by N,

$$\frac{\sum T}{N} = \frac{\sum X_1}{N} + \frac{\sum X_2}{N} + \cdots + \frac{\sum X_n}{N} \tag{9.2}$$

All terms in Eq. 9.2 are means, and each mean can be subtracted from
the corresponding term in Eq. 9.1. Accordingly, as deviations from their
respective means,

$$t = x_1 + x_2 + \cdots + x_n \tag{9.3}$$

Squaring both sides of Eq. 9.3, summing all terms, dividing by N, and
arranging the terms in a square format or matrix,

$$
\begin{aligned}
\frac{\sum t^2}{N} = {} & \frac{\sum x_1^2}{N} + \frac{\sum x_1x_2}{N} + \cdots + \frac{\sum x_1x_n}{N} \\
 & + \frac{\sum x_2x_1}{N} + \frac{\sum x_2^2}{N} + \cdots + \frac{\sum x_2x_n}{N} \\
 & + \cdots \\
 & + \frac{\sum x_nx_1}{N} + \frac{\sum x_nx_2}{N} + \cdots + \frac{\sum x_n^2}{N}
\end{aligned}
\tag{9.4}
$$


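Substitution of variances (V) for terms involving sums of squares of
deviations, and covariances (C) for terms representing sums of products of
deviations yields

$$
\begin{aligned}
V_t = {} & V_1 + C_{12} + \cdots + C_{1n} \\
 & + C_{21} + V_2 + \cdots + C_{2n} \\
 & + \cdots \\
 & + C_{n1} + C_{n2} + \cdots + V_n
\end{aligned}
\tag{9.5}
$$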


Each variable appears in both a row and a column. The variances form
the main diagonal and can be designated as V or s². The table is symmetrical
about the main diagonal in that $C_{ij} = C_{ji}$.

Since the procedure starts with raw scores and deviations from the means
in terms of original units, the variances are not necessarily unity, and the
covariances are not necessarily correlation coefficients. Since any
covariance may be converted into an r by dividing by the standard deviations of
the two variables concerned, any covariance, $C_{ij}$, can be replaced with
$r_{ij}s_is_j$.

The variances and covariances may be summed separately. Thus,

$$V_t = \sum_{i=1}^{n} V_i + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} C_{ij} \tag{9.6}$$

The expression

$$\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} C_{ij}$$

refers to a special summation in which each covariance is added only once,
although it appears twice in the total matrix.

Formula 9.5 is perfectly general. In the special instance in which
variables i and j are added, the result is

$$V_{i+j} = V_i + 2C_{ij} + V_j \tag{9.7}$$

or

$$s_{i+j}^2 = s_i^2 + 2r_{ij}s_is_j + s_j^2 \tag{9.7a}$$
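
Formulas 9.5 and 9.6 lend themselves to a direct numerical check. The sketch below is an illustration only: the three constituent variables are simulated, and the loadings, seed, and function name `covariance` are assumptions. It confirms that the variance of a sum variable equals the sum of every cell of the variance-covariance matrix.

```python
import random


def covariance(x, y):
    """Mean product of deviations (division by N, as in the text)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


random.seed(1)
n_cases = 200
common = [random.gauss(0, 1) for _ in range(n_cases)]

# Three hypothetical constituent variables sharing a common component.
variables = [[load * c + random.gauss(0, 1) for c in common]
             for load in (0.8, 0.5, 0.3)]

# The composite: for each case, the scores are simply added.
total = [sum(case) for case in zip(*variables)]

matrix = [[covariance(a, b) for b in variables] for a in variables]
k = len(variables)

# Formula 9.5: add every element of the square matrix.
sum_of_all_cells = sum(sum(row) for row in matrix)

# Formula 9.6: variances once, each distinct covariance twice.
diagonal = sum(matrix[i][i] for i in range(k))
upper = sum(matrix[i][j] for i in range(k - 1) for j in range(i + 1, k))
by_formula_9_6 = diagonal + 2 * upper

variance_of_total = covariance(total, total)
print(round(variance_of_total, 6), round(sum_of_all_cells, 6), round(by_formula_9_6, 6))
```

All three values coincide apart from floating-point rounding, whatever the correlations among the constituents happen to be.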


COVARIANCES OF COMPOSITE VARIABLES 

A formula for the covariance between any variable and the sum can be
found by multiplying all terms in Eq. 9.3 by the variable in deviation form,
summing all terms and dividing by N. Let the variable be $x_1$. Then

$$\frac{\sum x_1t}{N} = \frac{\sum x_1^2}{N} + \frac{\sum x_1x_2}{N} + \cdots + \frac{\sum x_1x_n}{N}$$

and

$$C_{1t} = V_1 + C_{12} + \cdots + C_{1n} \tag{9.8}$$

Or, more generally, replacing variable 1 with variable i and summing the
covariances,

$$C_{it} = V_i + \sum_{j} C_{ij} \qquad (j \neq i) \tag{9.9}$$

It is to be noted in Formula 9.9 that j refers to any variable except i.

The correlation is found by dividing both sides of Formula 9.9 by $s_is_t$.
Then

$$r_{it} = \frac{C_{it}}{s_is_t} = \frac{V_i + \sum_{j} C_{ij}}{s_is_t} \qquad (j \neq i) \tag{9.10}$$

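The covariance between two variables, each of which is a sum of two or
more constituent variables, is simply the sum of all the covariances
concerned. If one sum variable, $T_a$, is composed of n subvariables, and the other,
$T_b$, is made up of m subvariables, the nm covariances must be summed to
find the total covariance. The expression $t_a = x_1 + x_2 + \cdots + x_n$, which is
in deviation form, is multiplied by $t_b = x_{n+1} + x_{n+2} + \cdots + x_{n+m}$. The nm
products are summed and divided by N. This yields

$$
\begin{aligned}
\frac{\sum t_at_b}{N} = {} & \frac{\sum x_1x_{n+1}}{N} + \frac{\sum x_1x_{n+2}}{N} + \cdots + \frac{\sum x_1x_{n+m}}{N} \\
 & + \frac{\sum x_2x_{n+1}}{N} + \frac{\sum x_2x_{n+2}}{N} + \cdots + \frac{\sum x_2x_{n+m}}{N} \\
 & + \cdots \\
 & + \frac{\sum x_nx_{n+1}}{N} + \frac{\sum x_nx_{n+2}}{N} + \cdots + \frac{\sum x_nx_{n+m}}{N}
\end{aligned}
$$

which can be turned into a matrix arrangement of covariances as follows:

$$
\begin{aligned}
C_{t_at_b} = {} & C_{1,n+1} + C_{1,n+2} + \cdots + C_{1,n+m} \\
 & + C_{2,n+1} + C_{2,n+2} + \cdots + C_{2,n+m} \\
 & + \cdots \\
 & + C_{n,n+1} + C_{n,n+2} + \cdots + C_{n,n+m}
\end{aligned}
\tag{9.11}
$$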


Formula 9.11 contains mn covariances, which, when summed, yield the 
covariance between the two sum variables. The corresponding r is found 
by dividing the covariance by the standard deviations of the two variables. 
These standard deviations are obtained by extracting the square roots of 
expressions of the type exhibited in Formula 9.5. 

The use of raw-score variances and covariances to infer variances and 
correlations of combined variables is demonstrated in Example 9.1. 


EXAMPLE 9.1 


ADDITIVE PROPERTIES OF A VARIANCE-COVARIANCE MATRIX 


The Basic Principle. From a variance-covariance matrix, such as Table 9.1, 
correlations of the weighted or unweighted sums of constituent variables may be 
found quickly and easily. Any composite variance or covariance is developed by 
adding all elements of the square or rectangular matrix of the basic coefficients. 
Prior to summation, any variable may be weighted positively or negatively by 
appropriate multiplication of elements involving the variable. 

Similar operations are possible with a matrix of correlations with unity in the 
diagonal, as shown in Table 9.2. This may be considered a variance-covariance 
matrix of variables in z form. The correlations in Table 9.2 have been computed 
by dividing covariances shown in Table 9.1 by the square roots of the two cor- 
responding variances. 


TABLE 9.1. VARIANCE-COVARIANCE MATRIX OF SIX VARIABLES

(N = 239)

                               1       2       3       4       5       6
1. Reading test              3.176   2.455   1.867   1.820   1.395    .571
2. Linguistic test           2.455   3.646   1.748   1.625    .909    .324
3. Quantitative test         1.867   1.748   3.857   2.343   1.809    .592
4. Mathematic test           1.820   1.625   2.343   3.468   1.928    .757
5. Algebra test              1.395    .909   1.809   1.928   3.520   1.182
6. Grade in mathematics       .571    .324    .592    .757   1.182   1.177


An Illustration. Suppose the correlation is desired of the sum of the two verbal
tests, variables 1 and 2, with the sum of the three numerical tests, variables 3,
4, and 5. This correlation is, of course, the covariance of the two sum variables,
$C_{(1+2)(3+4+5)}$, divided by the two standard deviations, $s_{1+2}$ and $s_{3+4+5}$, which
are the square roots of corresponding variances. From Table 9.1, the covariance
is found by Formula 9.11 as follows:

$$
\begin{aligned}
C_{(1+2)(3+4+5)} &= C_{13} + C_{14} + C_{15} + C_{23} + C_{24} + C_{25} \\
 &= 1.867 + 1.820 + 1.395 + 1.748 + 1.625 + .909 = 9.364
\end{aligned}
$$

The two variances follow:

$$V_{1+2} = V_1 + 2C_{12} + V_2 = 3.176 + 2(2.455) + 3.646 = 11.732$$

$$
\begin{aligned}
V_{3+4+5} &= V_3 + V_4 + V_5 + 2(C_{34} + C_{35} + C_{45}) \\
 &= 3.857 + 3.468 + 3.520 + 2(2.343 + 1.809 + 1.928) = 23.005
\end{aligned}
$$

Accordingly,

$$r_{(1+2)(3+4+5)} = \frac{C_{(1+2)(3+4+5)}}{\sqrt{V_{1+2}}\sqrt{V_{3+4+5}}} = \frac{9.364}{\sqrt{11.732}\sqrt{23.005}} = .57$$

TABLE 9.2. MATRIX OF z SCORE VARIANCES AND COVARIANCES

(Original observations the same as for Table 9.1.)

                               1      2      3      4      5      6
1. Reading test              1.00    .72    .53    .55    .42    .30
2. Linguistic test            .72   1.00    .47    .46    .25    .16
3. Quantitative test          .53    .47   1.00    .64    .49    .28
4. Mathematic test            .55    .46    .64   1.00    .55    .37
5. Algebra test               .42    .25    .49    .55   1.00    .58
6. Grade in mathematics       .30    .16    .28    .37    .58   1.00


Since the original variance-covariance matrix is in single-digit coded scores
with approximately equal variances, correlations obtained from Table 9.2,
where the variables are in z form, would be expected to differ very little. Here,

$$r_{(1+2)(3+4+5)} = \frac{.53 + .55 + .42 + .47 + .46 + .25}{\sqrt{2 + 2(.72)} \times \sqrt{3 + 2(.64 + .49 + .55)}} = \frac{2.68}{\sqrt{3.44}\sqrt{6.36}} = .57$$

In cases in which raw-score variances differ widely, results from simple
summation may also differ considerably. However, by differentially weighting the
variables in one matrix or the other, precisely identical results may be obtained.

An Illustration with Differential Weighting. Let it be presumed that there is some
reason to find the correlation of twice the z score in the algebra test less the
z score in the reading test with the grade in mathematics. Since z scores are being
weighted, the correlation matrix is used. Then,

$$C_{(2z_5 - z_1)z_6} = 2C_{56} - C_{16} = (2 \times .58) - .30 = .86$$

$$V_{(2z_5 - z_1)} = 4V_5 + V_1 - 4C_{15} = 4 + 1 - 4(.42) = 5 - 1.68 = 3.32$$

$$r_{(2z_5 - z_1)z_6} = \frac{C_{(2z_5 - z_1)z_6}}{\sqrt{V_{(2z_5 - z_1)}}\sqrt{V_6}} = \frac{.86}{\sqrt{3.32}\sqrt{1.00}} = .47$$
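
The three computations of Example 9.1 follow one pattern: weight the rows and columns of the matrix, add the relevant cells, and divide by two standard deviations. The sketch below retraces them under stated assumptions. The matrices are entered as read from Tables 9.1 and 9.2, with the lower triangles filled by symmetry and the entry for variables 2 and 6 in Table 9.2 taken as .16 (recomputed from Table 9.1). The function names `bilinear` and `composite_r` are hypothetical labels, not the author's.

```python
import math

cov_9_1 = [  # Table 9.1: raw-score variances and covariances
    [3.176, 2.455, 1.867, 1.820, 1.395, 0.571],
    [2.455, 3.646, 1.748, 1.625, 0.909, 0.324],
    [1.867, 1.748, 3.857, 2.343, 1.809, 0.592],
    [1.820, 1.625, 2.343, 3.468, 1.928, 0.757],
    [1.395, 0.909, 1.809, 1.928, 3.520, 1.182],
    [0.571, 0.324, 0.592, 0.757, 1.182, 1.177],
]
corr_9_2 = [  # Table 9.2: the same variables in z form
    [1.00, 0.72, 0.53, 0.55, 0.42, 0.30],
    [0.72, 1.00, 0.47, 0.46, 0.25, 0.16],
    [0.53, 0.47, 1.00, 0.64, 0.49, 0.28],
    [0.55, 0.46, 0.64, 1.00, 0.55, 0.37],
    [0.42, 0.25, 0.49, 0.55, 1.00, 0.58],
    [0.30, 0.16, 0.28, 0.37, 0.58, 1.00],
]


def bilinear(w, matrix, v):
    """Sum of all cells after weighting row i by w[i] and column j by v[j]."""
    return sum(w[i] * matrix[i][j] * v[j]
               for i in range(len(w)) for j in range(len(v)))


def composite_r(matrix, w, v):
    """Correlation between two weighted composites of the matrix variables."""
    covariance = bilinear(w, matrix, v)
    return covariance / math.sqrt(bilinear(w, matrix, w) * bilinear(v, matrix, v))


verbal = [1, 1, 0, 0, 0, 0]          # variables 1 + 2
numerical = [0, 0, 1, 1, 1, 0]       # variables 3 + 4 + 5

r_raw = composite_r(cov_9_1, verbal, numerical)
r_z = composite_r(corr_9_2, verbal, numerical)

# Twice the algebra z score less the reading z score, against the grade.
r_weighted = composite_r(corr_9_2, [-1, 0, 0, 0, 2, 0], [0, 0, 0, 0, 0, 1])

print(round(r_raw, 2), round(r_z, 2), round(r_weighted, 2))
```

A weight of zero drops a variable from the composite, a weight of 1 adds it, and a negative weight subtracts it, so one routine covers simple sums and differentially weighted sums alike.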
ADDITIVE PROPERTIES OF CORRELATION COEFFICIENTS


Since a correlation coefficient is the slope of a line (or the tangent of an 
angle), r's as such cannot be added to obtain an average or value representing 
several r’s. However, when a correlation coefficient is considered as a 
covariance of z scores, it can be used in exactly the same fashion as any 
other covariance in inferring variances, covariances, and correlations of 
variables based on the summation of values of constituent variables. The 
process is illustrated in Table 9.2 in Example 9.1. With zero-order z 
scores, each variance is unity. If a matrix of correlations with unity in 
each of the diagonal cells is summed, each variable enters into the com- 
posite with equal weight. Formula 9.6 becomes 


$$V_t = n + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} r_{ij} \tag{9.6a}$$

and Formula 9.7 becomes

$$V_{i+j} = 2 + 2r_{ij} \tag{9.7b}$$

Similarly, the correlation between a single variable, i, and the composite
of all the variables added together, with equal standard deviations, is
(from Formula 9.10)

$$r_{it} = \frac{C_{it}}{s_is_t} = \frac{1 + \sum_{j} r_{ij}}{\sqrt{n + 2\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} r_{ij}}} \qquad (j \neq i) \tag{9.10a}$$

VARIANCES AND COVARIANCES OF WEIGHTED COMPOSITES 


Variables entering into composites may be weighted, and appropriate
variances and covariances readily obtained. In general,

$$t = w_1x_1 + w_2x_2 + \cdots + w_nx_n \tag{9.12}$$

By squaring Formula 9.12 and summing and dividing by N, the variance
of t is obtained. The correlations of t with simple or compound variables
can be found by procedures similar to those used in developing Formula 
9.11. It will be noted that in the regular matrix-like arrangements of 
Formulas 9.4, 9.5, and 9.11, each variable appears in a row or a column 
or in both a row and a column. Working within the matrix, we can, in effect, 
weight each score in a variable prior to adding the variable to a composite 
variance or covariance. To do this, each element in the row assigned to the 
variable and each element in the column assigned to the variable must be 
multiplied by the weight $w_i$. By this procedure, each covariance involving
the variable is multiplied by $w_i$, but the variance is multiplied by $w_i^2$.

The matrix may be of variances and covariances of any type: raw scores;
zero-order z scores (variances of 1.00 and covariances equal to r’s); or 
the residual variances and covariances of multivariate correlation. The 
Modified elements can now be added into combinations, in exactly the 
same fashion as in the addition of original elements. 

The procedure of working from a complete matrix of variances and
covariances is useful in theoretical work relating to the interrelationships of
large numbers of variables, such as the items of a psychological test. Once
the basic matrix is known, derived coefficients involving various
combinations of the variables are determinate, and can be found through
appropriate statistical operations.


VARIANTS OF PEARSON r: $r_{pt.bis.}$, φ, AND ρ


The Pearson product-moment formula for r, $r = \sum xy/Ns_xs_y$ or $r = \sum z_xz_y/N$,
has three important algebraic variants: the point biserial r, the phi
coefficient (denoted as φ), and ρ, the coefficient of rank correlation. These
coefficients are alike in that each can be derived from the Pearson formula,
usually in raw-score form, merely by substitution of equivalent expressions
for certain parts of the formula. No violence would be done were the
Pearson formula applied directly to the data to which one of these special
formulas is applicable. In each instance the original formula and its
modification yield identical results. The modifications are for convenience in
computation. The formulas for the point biserial r and the phi coefficient take
advantage of the fact that if a variable is dichotomous (that is, has only
two possible values), its mean and standard deviation are known directly
from the proportions of cases in the two categories. Somewhat similarly,
the formula for ρ takes advantage of the fact that for a set of N different
ranks, beginning with 1 (that is, 1, 2, 3, ..., N), the standard deviation is
$\sqrt{(N^2 - 1)/12}$ and the mean is $(N + 1)/2$.
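
The statement about ranks is easily verified by brute force. The following sketch (the function name `rank_mean_sd` and the particular values of N are arbitrary choices for illustration) computes the mean and standard deviation of the ranks 1 through N directly and compares them with the closed expressions.

```python
import math


def rank_mean_sd(n):
    """Mean and standard deviation (division by N) of the ranks 1, 2, ..., n."""
    ranks = range(1, n + 1)
    mean = sum(ranks) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in ranks) / n)
    return mean, sd


results = {}
for n in (5, 10, 37):
    mean, sd = rank_mean_sd(n)
    results[n] = (mean, sd)
    # Closed forms quoted in the text.
    print(n, mean, (n + 1) / 2, round(sd, 6), round(math.sqrt((n * n - 1) / 12), 6))
```

Because these two quantities depend only on N, they need not be recomputed from the data when both variables are expressed as untied ranks.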


POINT BISERIAL r

A biserial correlation is a measure of relationship between a dichotomous
variable on the one hand and a continuous variable on the other. Let the
continuous variable be denoted as X and the dichotomous as Y. If all Y
values are either 1 or 0, $\sum Y = N_p$, which is the number of cases "passing"
or receiving a score of 1. Accordingly,

$$M_y = \frac{\sum Y}{N} = \frac{N_p}{N} = p \tag{9.13}$$

p being the proportion of cases "passing."

If all Y values are either 1 or 0, then $\sum Y^2 = N_p$, since there are $N_p$ cases
with value of 1, and the square of 1 is 1. The value of the standard deviation,
starting with the raw-score formula, is

$$s_y = \sqrt{\frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2} = \sqrt{\frac{N_p}{N} - \left(\frac{N_p}{N}\right)^2} = \sqrt{\frac{N_p(N - N_p)}{N^2}} = \sqrt{\frac{N_pN_q}{N^2}} = \sqrt{pq} \tag{9.14}$$

In making the substitutions above, advantage is taken of the fact that
$N = (N_p + N_q)$, in which $N_q$ is the number of cases "failing," that is, cases
with value of 0.

From Formula 9.14 it is seen that the standard deviation of a dichotomous
variable coded as 1 and 0 is the square root of the product of the
two proportions. Since all values of Y are either 1 or 0, it is readily seen
that $\sum XY = \sum X_p$, the sum of variable X for those cases with a Y value of 1.

The point biserial r makes no assumption about the shape of the
distribution of the dichotomous variable. It is computed with the dichotomous
variable distributed at two discrete points; hence the name, "point
biserial." The values assigned to the scores at these points are actually
immaterial, although for convenience sake, they may be regarded as 1
or 0.

The X variable is regarded as continuous, although, as in the derivation
of the usual Pearson formula for r, no assumption is made as to whether or
not it is distributed normally.

The raw-score formula for r can be modified into the point biserial as
follows:

$$r_{pt.bis.} = \frac{\dfrac{\sum XY}{N} - M_xM_y}{s_xs_y} = \frac{\dfrac{\sum X_p}{N} - M_x\dfrac{N_p}{N}}{s_x\dfrac{\sqrt{N_pN_q}}{N}} = \frac{\sum X_p - M_xN_p}{s_x\sqrt{N_pN_q}} \tag{9.15}$$

An alternate form is found by dividing numerator and denominator of
Formula 9.15 by $N_p$:

$$r_{pt.bis.} = \frac{\dfrac{\sum X_p}{N_p} - M_x}{s_x\sqrt{\dfrac{N_q}{N_p}}} = \frac{M_p - M_x}{s_x}\sqrt{\frac{N_p}{N_q}} \tag{9.15a}$$

in which $M_p$ is the mean of variable X for the cases in which Y is 1.

There are many alternate formulas for the point biserial. The part
involving the radical in Formula 9.15a can be written as $\sqrt{p/q}$. Table P,
which includes this function, is in the Appendix. The computation of the
point biserial is illustrated in Example 9.2.
EXAMPLE 9.2


COMPUTATION OF $r_{pt.bis.}$ AND $r_{bis.}$


On one part of an examination in educational psychology, students were
marked satisfactory or unsatisfactory. Data for use in computing point biserial
and biserial correlations between this part and the remainder of the examination,
the continuous X variable, follow:


N = 75             N_p = 35          N_q = 40
ΣX = 5923          ΣX_p = 3084       ΣX_q = 2839
ΣX² = 501,395      M_p = 88.11       M_q = 70.98
M_x = 78.97        p = 35/75 = .467
s_x = 21.18        y = .398*


* (Height of unit normal curve at the point of dichotomy, from Table P in Appendix.)


Computation of the Point Biserial. By Formula 9.15a,

$$r_{pt.bis.} = \frac{M_p - M_x}{s_x}\sqrt{\frac{N_p}{N_q}} = \frac{88.11 - 78.97}{21.18}\sqrt{\frac{35}{40}} = .40$$

Computation is facilitated by entering Table P (Appendix) with p = .47 and
finding $\sqrt{N_p/N_q}$ or $\sqrt{p/q}$ as .94.

Computation of the Biserial. If, as in this case, it can be assumed that the
variable underlying the dichotomy is normally distributed, $r_{bis.}$ is an appropriate
coefficient with which to estimate the correlation that would be found if
information on a continuous variable were substituted for the dichotomy.
Using the information given above, including the height of the unit normal
curve at p = .467, then, by Formula 9.22,

$$r_{bis.} = \frac{(M_p - M_x)p}{ys_x} = \frac{(88.11 - 78.97)(.467)}{.398 \times 21.18} = .51$$
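
The claim that the point biserial is merely the Pearson formula in another dress can be tried on a handful of scores. In the sketch below the eight cases are invented for illustration, and the function names `pearson` and `point_biserial` are hypothetical. The final two lines simply repeat the substitutions of Example 9.2 from its summary figures.

```python
import math


def pearson(x, y):
    """Raw-score product-moment r, with standard deviations computed with N."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    sx = math.sqrt(sum(a * a for a in x) / n - mx ** 2)
    sy = math.sqrt(sum(b * b for b in y) / n - my ** 2)
    return cov / (sx * sy)


def point_biserial(x, y):
    """Formula 9.15a: y must be coded 1 ("passing") or 0 ("failing")."""
    n = len(x)
    passing = [xi for xi, yi in zip(x, y) if yi == 1]
    n_p = len(passing)
    n_q = n - n_p
    m_p = sum(passing) / n_p
    m_x = sum(x) / n
    s_x = math.sqrt(sum((xi - m_x) ** 2 for xi in x) / n)
    return (m_p - m_x) / s_x * math.sqrt(n_p / n_q)


# Invented scores for eight cases: X continuous, Y dichotomous.
x = [12, 15, 9, 20, 17, 8, 14, 11]
y = [1, 1, 0, 1, 1, 0, 0, 0]
print(round(pearson(x, y), 6), round(point_biserial(x, y), 6))

# Substitutions of Example 9.2, using its summary figures.
r_pt_bis = (88.11 - 78.97) / 21.18 * math.sqrt(35 / 40)
r_bis = (88.11 - 78.97) * 0.467 / (0.398 * 21.18)
print(round(r_pt_bis, 2), round(r_bis, 2))
```

For the same data the biserial estimate exceeds the point biserial, as it does in Example 9.2.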


APPLICATIONS OF THE POINT BISERIAL AND OF BISERIAL r


The point biserial is used whenever a measure of relationship between a
dichotomous variable and a continuous variable is needed and when it is
inappropriate to assume that a normal distribution underlies the dichotomy.
If we wish to relate sex (which is certainly dichotomous) with a continuous
variable, we can arbitrarily assign values of, say, 1 to male and 0 to female,
and compute the point biserial. The sign of the coefficient in this case is
arbitrary and depends on the coding. Similarly, we can correlate race (if
two classes only) or any other two-class categorical or ordinal variable
with a continuous variable.

When there is a normally distributed, continuous variable underlying the
dichotomy and when there is need to estimate its correlation with the
observed continuous variable, the biserial correlation, discussed later in
this chapter, is appropriate. While the biserial r is,
strictly speaking, only an estimate of what would be found if more
information were available, it has important uses. When a variable is imper-
fectly measured and only dichotomous information is available, the in- 
vestigator may be more interested in the probable correlations of the 
underlying continuous variable than in the correlations of the observed 
dichotomous variable. In aviation training, for example, ability to learn to 
fly is probably a normally distributed trait, but in the military situation, 
only the fact of graduation or elimination from training may be available. 
The biserial r is more appropriate than the point biserial if the researcher is 
more interested in how the predictors are related to the basic underlying 
characteristic of “ability to learn to fly” than in how they are related to the 
fact of graduation or elimination from flying training. A formula for biserial 
r is given in a later section. 


THE PHI COEFFICIENT 


The phi coefficient ($\phi$) is another algebraic variant of the Pearson
product-moment formula for r. It is a measure of the relationship between
two dichotomous variables, neither of which is considered to represent an
underlying normal distribution. Each variable can be considered as having
two possible values, 1 and 0, and this assumption facilitates the
development of a convenient computing formula. However, any pair of discrete
values may be used for either variable without affecting $\phi$. If two such
“point” variables are correlated by the ordinary Pearson formula, the result
will agree precisely with the correlation as found by the formula for phi.

To develop a formula for $\phi$, the four frequencies in a 2 by 2 diagram
(illustrated in Fig. 9.1) may be taken as a, b, c, and d. For the Y variable,
there are two values, 1 and 0, and $\sum Y = \sum Y^2 = (a + b)$, while $M_y$
(or $p_y$) is $(a + b)/N$. Similarly, for the X variable, there are also two
values, 1 and 0, and $\sum X = \sum X^2 = (b + d)$ and $M_x = p_x = (b + d)/N$.
There are b cases that have scores of 1 in both X and Y. Accordingly,
$\sum XY = b$. It is also to be noted that $(a + b + c + d) = N$.


                    X
     Y          0         1       Total
     1          a         b       a + b
     0          c         d       c + d
   Total      a + c     b + d       N


FIGURE 9.1 FREQUENCIES IN A 2 x 2 DIAGRAM. N = a + b + c + d


To find expressions for the standard deviations, we utilize the fact that
for either dichotomous variable, $s = \sqrt{pq}$. Accordingly,

\[ s_y = \frac{\sqrt{(a + b)(c + d)}}{N} \qquad \text{and} \qquad s_x = \frac{\sqrt{(b + d)(a + c)}}{N} \]

Substituting values for $\sum XY$, means and standard deviations in a
raw-score formula for r,

\[ r = \phi = \frac{\dfrac{\sum XY}{N} - M_x M_y}{s_x s_y}
   = \frac{\dfrac{b}{N} - \dfrac{(b + d)}{N} \cdot \dfrac{(a + b)}{N}}{\dfrac{\sqrt{(b + d)(a + c)}}{N} \cdot \dfrac{\sqrt{(a + b)(c + d)}}{N}}
   = \frac{bN - (b + d)(a + b)}{\sqrt{(b + d)(a + c)(a + b)(c + d)}} \tag{9.16} \]

When a matrix of phi coefficients is to be found, Formula 9.16 is
convenient, since expressions of the type $(b + d)$ and
$\sqrt{(b + d)(a + c)}$ can be found once for each variable. For each
combination of two variables it is then necessary to find only b, the number
of individuals in the “pass” categories of both variables.


An alternate to Formula 9.16 can be found by replacing N with
$(a + b + c + d)$, multiplying out terms in the numerator and canceling where
appropriate, and placing the terms in the denominator under a single radical.
It thereupon appears that for two dichotomous distributions,

\[ r = \phi = \frac{bc - ad}{\sqrt{(a + b)(a + c)(b + d)(c + d)}} \tag{9.16a} \]

Computation of $\phi$ is shown in Example 9.3.


EXAMPLE 9.3 


COMPUTATION OF $\phi$ AND $r_{tet}$

Computation of $\phi$. In a 2 by 2 diagram, such as the one below, $\phi$ or
product moment r between two dichotomous variables may be found by Formula
9.16 or Formula 9.16a. Whenever a 2 x 2 measure is computed between variables
that are truly dichotomous, or whenever a coefficient representing a 2 x 2
relationship is to be used in developing a regression equation, $\phi$ is
appropriate.


Data. Among the items administered to 94 employees in a supermarket were:

X: “How do you like the kind of work you do?”

Y: “How do you rate this store as a place to work?”

Responses, originally obtained in multiple-choice form, were consolidated as
follows:

                                         X
                          WORK BORING OR      WORK
                          ONLY FAIRLY         LIKED
         Y                INTERESTING         VERY MUCH      TOTAL
 Excellent place
   to work                    7 (a)             18 (b)          25
 Less satisfactory
   place to work             44 (c)             25 (d)          69
         TOTAL               51                 43           N = 94

By Formula 9.16,

\[ \phi = \frac{bN - (b + d)(a + b)}{\sqrt{(b + d)(a + c)(a + b)(c + d)}}
        = \frac{(18 \times 94) - (43 \times 25)}{\sqrt{43 \times 51 \times 25 \times 69}}
        = \frac{617}{\sqrt{3{,}782{,}925}} = .32 \]


Formula 9.16a yields identical results, since the denominator is unchanged
and the numerator is algebraically identical; that is,
(bc - ad) = (18 x 44) - (7 x 25) = 617.
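
The arithmetic of Example 9.3 can be checked with a short Python sketch. The function names (`phi_from_counts`, `phi_cross_product`, `pearson_r`) are illustrative and not part of the text; the cell labels a, b, c, d follow Fig. 9.1. The sketch also expands the four cells into paired 0/1 scores to confirm that the ordinary Pearson formula gives the same value.

```python
import math

def phi_from_counts(a, b, c, d):
    """Phi by Formula 9.16; b is the number scoring 1 on both X and Y."""
    n = a + b + c + d
    numerator = b * n - (b + d) * (a + b)
    denominator = math.sqrt((b + d) * (a + c) * (a + b) * (c + d))
    return numerator / denominator

def phi_cross_product(a, b, c, d):
    """Phi by Formula 9.16a."""
    return (b * c - a * d) / math.sqrt((a + b) * (a + c) * (b + d) * (c + d))

def pearson_r(xs, ys):
    """Ordinary product-moment r from paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Supermarket data of Example 9.3
a, b, c, d = 7, 18, 44, 25

# Expand the cells into paired scores:
# a: X=0, Y=1;  b: X=1, Y=1;  c: X=0, Y=0;  d: X=1, Y=0
xs = [0] * a + [1] * b + [0] * c + [1] * d
ys = [1] * a + [1] * b + [0] * c + [0] * d

phi_916 = phi_from_counts(a, b, c, d)
phi_916a = phi_cross_product(a, b, c, d)
r_01 = pearson_r(xs, ys)
print(round(phi_916, 2), round(phi_916a, 2), round(r_01, 2))
```

All three values agree, which illustrates the statement that phi is an algebraic variant of the Pearson formula rather than a separate coefficient.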

Computation of $r_{tet}$. Strictly speaking, $r_{tet}$ must be estimated
rather than computed. For entry into the Cheshire-Saffir-Thurstone diagrams
(1), frequencies must be reduced to proportions. The preceding example
becomes:

                                         X
                          WORK BORING OR      WORK
                          ONLY FAIRLY         LIKED
         Y                INTERESTING         VERY MUCH      TOTAL
 Excellent place
   to work                    .07               .19            .26
 Less satisfactory
   place to work              .47               .27            .74
         TOTAL                .54               .46           1.00


Four estimates of $r_{tet}$ are made, based on the lower p (or q) in one
variable, either proportion in the other, and the $p_{xy}$ at the
intersection as follows:

   LOWER p (or q)     PROPORTION IN
   IN ONE VARIABLE    OTHER VARIABLE     p_xy     ESTIMATED r_tet

        .46                .26           .19           .46
        .46                .74           .27           .56
        .26                .54           .07           .58
        .26                .46           .19           .53

                                Mean estimated r_tet   .53
Actually, the diagrams use special notation and are self-explanatory. The
point of interest here is that $r_{tet}$ is obviously considerably higher
than $\phi$.

Pearson's Cosine¹ Approximation of $r_{tet}$. Pearson (2), who developed
$r_{tet}$, also presented a useful approximation:

\[ r_{tet} = \cos \left( \frac{180^\circ}{1 + \sqrt{\dfrac{bc}{ad}}} \right) \tag{9.16b} \]

While various special tables for this function exist, any trigonometric table
may be used. In this case,

\[ r_{tet} = \cos \left( \frac{180^\circ}{1 + \sqrt{\dfrac{18 \times 44}{7 \times 25}}} \right)
           = \cos \frac{180^\circ}{3.127} = \cos 57.56^\circ = .54 \]
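
As a check on the cosine approximation, the following Python sketch evaluates Formula 9.16b for the supermarket data. The function name `cosine_rtet` is hypothetical, and the sketch assumes that neither a nor d is zero, since the ratio bc/ad would otherwise be undefined.

```python
import math

def cosine_rtet(a, b, c, d):
    """Pearson's cosine approximation to tetrachoric r (Formula 9.16b).

    Assumes a * d > 0 so that the ratio bc/ad is defined.
    """
    ratio = math.sqrt((b * c) / (a * d))
    angle_degrees = 180.0 / (1.0 + ratio)
    return math.cos(math.radians(angle_degrees))

r_tet_approx = cosine_rtet(7, 18, 44, 25)
print(round(r_tet_approx, 2))
```

When bc = ad the angle is 90 degrees and the approximation is zero, in agreement with $\phi$ = 0 for the same table.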

When working simultaneously with large numbers of dichotomous variables, it
is sometimes convenient to use covariances in intermediate computations
instead of r's or $\phi$'s. The development is the same as for Formula 9.16,
except that $s_x s_y$ and equivalents are omitted. Accordingly, the
covariance for dichotomous variables coded 1 and 0 is

\[ C_{xy} = \frac{\sum XY}{N} - M_x M_y = \frac{b}{N} - \frac{(b + d)}{N} \cdot \frac{(a + b)}{N}
          = \frac{bN - (b + d)(a + b)}{N^2} \tag{9.17} \]


APPLICATIONS OF $\phi$ AND $r_{tet}$

Later in this chapter there is a discussion of tetrachoric r ($r_{tet}$),
which bears exactly the same relationship to $\phi$ as $r_{bis}$ does to
$r_{pt.bis}$. If the correlation of two dichotomous variables is used to
estimate what the correlation would have been if it had been based on
continuously measured, normally distributed variables underlying the
dichotomies, tetrachoric r is appropriate. Tetrachoric r is a good estimate
of the correlation when continuous information has somehow been dichotomized
and only the dichotomous information is available.

A great advantage of the phi coefficient is that, as a least squares solution
without assumptions, it fits into any Pearson r system and it can be used in
multiple and partial correlation and in computing regression equations. For
this reason, the phi coefficient is generally preferable to $r_{tet}$ in
handling the statistics of items. In this connection one application is that
the variance of the total variable can be found by summing the
variance-covariance matrix of the constituent items; that is, the matrix with
variances equal to pq and with covariances as found by Formula 9.17.

¹ In trigonometry the cosine of an angle is defined as the ratio of two sides
of a right triangle that incorporates the angle. It is the side adjacent to
the angle divided by the hypotenuse.
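
The statement that the total variance equals the sum of the item variance-covariance matrix can be illustrated with a minimal Python sketch. The six response patterns below are invented for illustration, and `item_cov_matrix` is a hypothetical helper; variances use the divisor N, as in the formulas above.

```python
def item_cov_matrix(scores):
    """Variance-covariance matrix (divisor N) of 0/1 item scores.

    scores: one list of item scores per person.
    """
    n = len(scores)
    k = len(scores[0])
    means = [sum(person[j] for person in scores) / n for j in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            joint = sum(person[i] * person[j] for person in scores) / n
            # Formula 9.17; on the diagonal this reduces to p - p^2 = pq
            cov[i][j] = joint - means[i] * means[j]
    return cov

# Hypothetical responses of six persons to three items
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [1, 1, 1],
]

cov = item_cov_matrix(scores)
sum_of_matrix = sum(sum(row) for row in cov)

# Variance of the total scores computed directly
totals = [sum(person) for person in scores]
mean_total = sum(totals) / len(totals)
var_total = sum((t - mean_total) ** 2 for t in totals) / len(totals)

print(round(sum_of_matrix, 4), round(var_total, 4))
```

The two printed values coincide, so the total-score variance never has to be computed from the total scores themselves once the item matrix is at hand.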

When observed continuous distributions are highly skewed, and it is believed
that the skewness results from defects in the measuring instrument,
$r_{tet}$ is sometimes computed to estimate what the correlation would have
been if normally distributed variables had been obtained. However, when
using information that falls naturally into dichotomies, such as sex, and
sometimes race and marital status, the phi coefficient is generally preferable
to $r_{tet}$.

One characteristic of the phi coefficient is that it is markedly affected by
the proportions in the upper and lower categories of the two variables and
can reach a maximum of 1.00 only when $p_x = p_y$; that is, when points of
dichotomy in the two variables are exactly the same.
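
The ceiling on phi can be made concrete with a small Python sketch. With the marginal totals held fixed, the numerator bc - ad grows with b, so the largest phi is reached when b is as large as the marginals allow. The function `max_phi` is a hypothetical helper built on that observation; the marginals used are those of Example 9.3.

```python
import math

def phi(a, b, c, d):
    """Phi by Formula 9.16a."""
    return (b * c - a * d) / math.sqrt((a + b) * (a + c) * (b + d) * (c + d))

def max_phi(n_x1, n_y1, n):
    """Largest phi attainable when n_x1 cases score 1 on X and n_y1 on Y."""
    b = min(n_x1, n_y1)   # as many joint 1s as the marginals permit
    a = n_y1 - b          # Y = 1, X = 0
    d = n_x1 - b          # X = 1, Y = 0
    c = n - a - b - d     # both 0
    return phi(a, b, c, d)

# Marginals of Example 9.3: 43 of 94 score 1 on X, 25 of 94 score 1 on Y
phi_max_example = max_phi(43, 25, 94)

# Identical points of dichotomy: 25 of 94 score 1 on each variable
phi_max_equal = max_phi(25, 25, 94)

print(round(phi_max_example, 2), round(phi_max_equal, 2))
```

For the supermarket marginals the ceiling is well below 1.00, so the observed $\phi$ of .32 is better judged against that ceiling than against unity.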


SPEARMAN'S RANK CORRELATION COEFFICIENT 


A third variant of the Pearson formula is Spearman's rank correlation
coefficient, already discussed in Chapter 4. It is demonstrated in elementary
algebra that the mean of a series of N ranks beginning with 1 is (N + 1)/2.
It can also be shown that the sum of the squares of N ranks is
N(N + 1)(2N + 1)/6. Substitution of these values in a raw-score formula for
the standard deviation yields

\[ s = \sqrt{\frac{\sum X^2}{N} - M^2}
     = \sqrt{\frac{N(N + 1)(2N + 1)}{6N} - \left( \frac{N + 1}{2} \right)^2}
     = \sqrt{\frac{N^2 - 1}{12}} \tag{9.18} \]


Substituting Formula 9.18 and the value for the mean in a raw-score
formula for r, and remembering that means and standard deviations are
identical for two sets of N ranks, we have

\[ \rho = \frac{\dfrac{\sum R_x R_y}{N} - M_x M_y}{s_x s_y}
        = \frac{\dfrac{\sum R_x R_y}{N} - \left( \dfrac{N + 1}{2} \right)^2}{\dfrac{N^2 - 1}{12}}
        = \frac{12 \sum R_x R_y - 3N(N + 1)^2}{N(N^2 - 1)} \tag{9.19} \]


This is the “rank product” formula for $\rho$, the rank correlation
coefficient, in which $\sum R_x R_y$ is the sum of the N products of the
paired ranks. When there are no ties, it gives exactly the same result as the
“rank difference” formula, which is a little easier to compute. The
derivation of the rank difference formula starts with the difference formula
for r (from Formula 6.10):

\[ r_{xy} = \frac{V_x + V_y - V_{x-y}}{2 s_x s_y} \]

Since any series of N ranks will sum to N(N + 1)/2, the differences
between two such series must sum to zero. Accordingly, to find the variance
of the differences, it is necessary only to square each difference
(denoted as D), sum, and divide by N. Thus, for a system of two sets of ranks
ranging from 1 to N,

\[ V_{x-y} = \frac{\sum D^2}{N} \tag{9.20} \]

Substituting Formulas 9.18 and 9.20 in Formula 6.10 yields

\[ \rho = \frac{\dfrac{N^2 - 1}{12} + \dfrac{N^2 - 1}{12} - \dfrac{\sum D^2}{N}}{2 \sqrt{\dfrac{N^2 - 1}{12}} \sqrt{\dfrac{N^2 - 1}{12}}}
        = \frac{\dfrac{2(N^2 - 1)}{12} - \dfrac{\sum D^2}{N}}{\dfrac{2(N^2 - 1)}{12}}
        = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} \tag{9.21} \]

This is the rank difference formula for r, denoted as $\rho$ to show that it
is computed from ranks instead of raw scores. It is seen that if the original
data are two sets of ranks, Formula 9.21 yields the Pearson r between them.
If, however, values on interval or ratio scales have been ranked, and then
Formula 9.21 is applied, $\rho$ is an approximation to the r that would have
been found if the original scores had been correlated.

It is to be noted that the distribution of ranks, like the distribution of
percentiles, is rectilinear. Accordingly, if a continuous variable is ranked
prior to finding the correlation, some information is lost. In general,
$\rho$ will be a little smaller than r for the same data, but the difference
is trifling. Since a major function of $\rho$ is to obtain an estimate of the
correlation from a small sample of cases, the difference between $\rho$ and r
is inconsequential. Computation of $\rho$ is shown in Examples 4.4 and 4.5.
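
The equivalence of the rank product formula (9.19), the rank difference formula (9.21), and the ordinary Pearson r applied to untied ranks can be verified numerically. The Python sketch below uses two invented sets of eight ranks; the function names are illustrative only.

```python
import math

def rank_product_rho(rx, ry):
    """Formula 9.19: rho from the sum of products of paired ranks."""
    n = len(rx)
    sum_products = sum(x * y for x, y in zip(rx, ry))
    return (12 * sum_products - 3 * n * (n + 1) ** 2) / (n * (n ** 2 - 1))

def rank_difference_rho(rx, ry):
    """Formula 9.21: rho from the sum of squared rank differences."""
    n = len(rx)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def pearson_r(xs, ys):
    """Ordinary product-moment r from paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ranks of eight persons on two variables (no ties)
rx = [1, 2, 3, 4, 5, 6, 7, 8]
ry = [2, 1, 4, 3, 6, 8, 5, 7]

rho_product = rank_product_rho(rx, ry)
rho_difference = rank_difference_rho(rx, ry)
r_on_ranks = pearson_r(rx, ry)

# Formula 9.18: standard deviation of N untied ranks (divisor N)
n = len(rx)
sd_formula = math.sqrt((n ** 2 - 1) / 12)
sd_direct = math.sqrt(sum((x - (n + 1) / 2) ** 2 for x in rx) / n)

print(round(rho_product, 3), round(rho_difference, 3), round(r_on_ranks, 3))
```

The agreement holds only when there are no ties; with tied ranks the two shortcut formulas and the Pearson r on ranks can differ.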


"AS IF" COEFFICIENTS: ry. “tet. 


Both biserial r and tetrachoric r, in contrast to the point biserial and the
phi coefficient described above, are "as if" coefficients in that they are
estimates of what the correlation would be found to be if the dichotomous
variables involved were actually continuously measured and normally
distributed and if the more complete information were substituted for the
dichotomous information at hand.

BISERIAL r

A convenient formula² for biserial r is

\[ r_{bis} = \frac{(M_p - M_x) N_p}{y N s_x} = \frac{(M_p - M_x)\,p}{y\,s_x} \tag{9.22} \]

in which $M_p$ is the mean score on the continuous variable of individuals in
the upper group, $N_p$ is the number of cases in that group (so that
$p = N_p/N$), $M_x$ and $s_x$ are the mean and standard deviation of the
continuous variable for the total group, and y³ is the height of the unit
normal curve at the point of dichotomy, as found from Table P (Appendix).
Example 9.2 showed the computing steps.

² A derivation of the formula for $r_{bis}$ is given by Peters and Van
Voorhis (4).
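
A minimal Python sketch of the biserial computation follows, using invented scores for ten persons. It takes the ordinate y from the standard library's `statistics.NormalDist` rather than from Table P, and it computes the point biserial in the common form $(M_p - M_x)/s_x \cdot \sqrt{p/q}$, which is assumed here to match the formula given earlier in the chapter. All names are illustrative.

```python
import math
from statistics import NormalDist

def biserial_and_point_biserial(scores, passed):
    """Return (r_bis, r_pt_bis, p, y) for a continuous score and a 0/1 criterion."""
    n = len(scores)
    mean_all = sum(scores) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in scores) / n)
    upper = [x for x, flag in zip(scores, passed) if flag == 1]
    p = len(upper) / n
    q = 1 - p
    mean_upper = sum(upper) / len(upper)
    # Point of dichotomy on the unit normal curve: area p lies above it
    z_cut = NormalDist().inv_cdf(q)
    y = NormalDist().pdf(z_cut)          # height of the curve at that point
    r_bis = (mean_upper - mean_all) * p / (y * sd_all)
    r_pt_bis = (mean_upper - mean_all) / sd_all * math.sqrt(p / q)
    return r_bis, r_pt_bis, p, y

# Hypothetical data: test scores and pass (1) or fail (0) on a criterion
scores = [3, 4, 5, 5, 6, 6, 7, 7, 8, 9]
passed = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

r_bis, r_pt_bis, p, y = biserial_and_point_biserial(scores, passed)
print(round(r_bis, 2), round(r_pt_bis, 2))
```

Under these definitions the two coefficients differ only by the factor $\sqrt{pq}/y$, which always exceeds 1; this is why the biserial is uniformly the larger of the two for the same data.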

While biserial r affords a useful estimate when the assumption of a
normal variable underlying the dichotomy is justifiable, a computed
$r_{bis}$ cannot be considered exactly the same as the corresponding
product-moment r. In the first place, its maximum value is not 1.00 but 1.25.
In the second place, if the proportion of cases in one category or the other
is less than .10, the coefficient is considered unreliable. In the third
place, it may not be used in computing regression equations, since it does
not belong to the Pearson family of correlation coefficients and since
information for its use as a Pearson r is not available. It is strictly an
estimate of what r would be if information not currently available became
known.

It may be noted that a series of biserials has a high correlation with the
series of point biserials computed from the same data. For analyzing item
data in psychological tests, it rarely makes any difference which coefficient 
is used, since item data are evaluated relatively and since items that have 
high point biserials will have high biserials, and vice versa. However, in 
presenting research results against a dichotomous criterion, the choice of 
the appropriate coefficient is a matter of some concern. If the criterion is 
truly dichotomous, the point biserial is to be preferred. In personnel and 
clinical psychology, however, dichotomous criteria probably represent 
variables that are truly continuous. The point biserial is such an under- 
estimate of the biserial that the biserial is to be preferred if the correlation 
between a continuous predictor and a dichotomous criterion is to be 
accurately represented. 


TETRACHORIC r


As does the biserial, $r_{tet}$ involves an independent derivation of a
formula for a relationship between two dichotomous variables rather than
being an algebraic variant of the Pearson product-moment formula. Again, like
biserial r, it is an “as if” coefficient in that it estimates what the
correlation would be between two dichotomous variables if they were
continuously and normally distributed and if complete information were
available.

The formula for $r_{tet}$, an infinite series of terms, is sufficiently
complex that it is seldom used for computation, even with certain
approximations. Instead, tetrachoric r's are usually found from computing
diagrams, such as those by Cheshire, Saffir, and Thurstone (1), which are
published in a format that is fully self-explanatory. Some investigators
dichotomize continuous data and then apply the diagrams to find the resultant
tetrachoric r, which is then taken as the Pearson r. While the procedure is
expeditious, it has doubtful merit, since the coefficient is far less stable
than Pearson r.

There seems to be little justification for using $r_{tet}$ in item analysis
work, since it does not fit into the Pearson product-moment family of r, its
computation is involved, and it is unlikely that the underlying variables are
normally distributed. On the other hand, it is the appropriate coefficient
whenever two dichotomous variables are correlated and the assumptions
underlying its derivation are met.

Finding of $r_{tet}$ was illustrated in Example 9.3.

³ It is to be noted that the meaning of y here (in Formula 9.22) has no
particular relationship to y as a variable.


CURVILINEAR CORRELATION 


Almost exclusively, measures of relationship used in psychological
statistics are based upon fitting a straight line to two sets of
observations. The regression line, in the fitting of which the sum of the
squares of the errors has been made as small as possible, is considered the
“best-fitting” straight line. Its basic equation is

\[ \tilde{y} = bx + a \]

in which $\tilde{y}$ is any value on the line, b is the slope, and a is its
intercept. If all values are in z form, b becomes the correlation coefficient
and a becomes zero.

It is easier to fit straight lines to bivariate distributions than it is to
fit any other function. That fact, however, should not deter one from looking
for other functions if they will better describe the relationship between the
two variables under study.

Curvilinear correlation starts with the premise that the relationship
between two variables can be better described by the equation of a curve
than it can be by the equation of a straight line. The principle of least
squares still applies. A straight line is actually a special, simple type of
curve. A general method of curvilinear correlation will show that the best
equation is a straight line when the straight line best fits the data; and in
other cases it will give the equation of the curve of best fit by the least
squares principle.

The first and perhaps most important technique of investigating the
shape of the regression line is the inspection of the scatter plot. A glance
at a correlation diagram can usually reveal whether or not there is a
possibility that the line of best fit is not linear. One disadvantage of
machine computation of descriptive statistics is that scatter plots must be
obtained as an additional step rather than as basic to the computation of the
correlation coefficient. Systematic psychological research should, however,
include provision for the inspection of scatter plots. Ordinarily these can
be obtained even when the primary mode of computation is by machine.


ETA AS A MEASURE OF CURVILINEAR CORRELATION 


If the plot shows a tendency toward curvilinearity, the coefficient eta
($\eta$) may be computed. Eta supplies neither an equation of a line of best
fit nor any statement of its shape. It is essentially a negative type of
coefficient. When eta is higher than r, some line other than a straight line
fits the data best. This line may or may not be a smooth curve.

Eta operates by determining the average variance within each column
(or row) in a scatter diagram and comparing it with the total variance
of the same variable. Computation is shown in Example 9.4.


EXAMPLE 9.4 


COMPUTATION OF $\eta$, THE CORRELATION RATIO


Nature of Eta. While the computation does not involve fitting a set
mathematical function connecting two variables, $\eta$ is often conceived as
a measure of nonlinear correlation. A very irregular line of best fit can
yield just as high an eta as will a smooth curve. The more the cell
frequencies in the vertical arrays are concentrated, the higher one of the
two etas, $\eta_{yx}$; while the greater the concentration in the horizontal
arrays, the higher $\eta_{xy}$. If $\eta$'s are markedly different from r,
then a search can be made for a function connecting the two variables more
appropriate than the straight line automatically fitted whenever a
correlation is computed.

Data in Table 9.3. This table presents the bivariate distribution of the coded
scores of 210 high school seniors on two achievement tests. The coded values,
ranging from 0 through 9, are used directly as $d_x$ and $d_y$. As in Example
6.2, each cell frequency is denoted as $f_{xy}$.

In either dimension, the total variance $V_x$ or $V_y$ can be shown to consist
of two components: the variance of the means of the arrays $V_{m_x}$ or
$V_{m_y}$; and the variance around the means. By definition, $\eta_{yx}^2$ is
the ratio $V_{m_y}/V_y$ and $\eta_{xy}^2$ is $V_{m_x}/V_x$. Somewhat
similarly, $r^2$ is the ratio of the predictable variance to the total
variance; that is, $r_{xy}^2 = V_{\tilde{y}}/V_y$. Since the predictable
variance is of values on the regression line (which is necessarily straight),
while the means within arrays used in finding $\eta^2$ may vary from a
straight line, $r^2$ may equal $\eta^2$ but cannot exceed it.

The computation of $\eta_{yx}$ involves the following steps:

1. Find the $f_x$'s for the columns.

2. Within each column, multiply each cell frequency, $f_{xy}$, by its
corresponding $d_y$, and sum.

3. Square each $\sum d_y f_{xy}$.

4. Divide each $(\sum d_y f_{xy})^2$ by the corresponding $f_x$ and sum the
quotients. This yields

\[ \sum \frac{(\sum d_y f_{xy})^2}{f_x} \]

5. Multiply each $f_y$ by the corresponding $d_y$ and sum. This yields
$\sum d_y f_y = \sum y'$. (If the cumulative method for finding $\sum y'$ is
used, the $Cf_y$ are found and summed, excluding the Cf in the 0 row.)

6. Compute $\sum y'^2$ by multiplying each $f_y$ by $d_y^2$ (or by summing the
products of the $Cf_y$ and the successive odd numbers).

7. Find $\eta_{yx}$ by the formula

\[ \eta_{yx} = \sqrt{\frac{N \sum \dfrac{(\sum d_y f_{xy})^2}{f_x} - (\sum d_y f_y)^2}{N \sum y'^2 - (\sum d_y f_y)^2}} \tag{9.23} \]

which is the square root of the ratio $V_{m_y}/V_y$.

By parallel operations within the rows, $\eta_{xy}$ can be found.

To find $r_{xy}$, the only additional information needed is $\sum x'y'$, which
may be found as $\sum d_y (\sum d_x f_{xy})$; that is, multiplying each
$\sum d_x f_{xy}$ by the corresponding $d_y$ and summing. Alternately it may
be found as $\sum d_x (\sum d_y f_{xy})$.

All computations are shown in Table 9.2.

Note on Testing the Significance of $\eta$. The greater the difference
between $\eta^2$ and $r^2$, the greater is the departure from linearity of
regression. Testing for the significance of the departure involves the F
distribution, discussed in Chapters 12 and 14. In this case, F is found as
follows:

\[ F = \frac{(\eta^2 - r^2)/(n - 2)}{(1 - \eta^2)/(N - n)} \]

in which n is the number of row or column means and N is the total number of
cases. In entering the F table, the degrees of freedom are (n - 2) and
(N - n). The test may be applied both to $\eta_{xy}$ and $\eta_{yx}$.


In each scatter diagram, two etas may be computed, one describing the
regression of x on y, the other, the regression of y on x. If the information
in x is informative with regard to y, the variability within the columns will
be very small compared with the total variability of y, and $\eta_{yx}$ will
take a high value. If this value is much higher than r computed from the same
data, it follows that the regression is curvilinear, but no equation of the
line of best fit is available. There is a similar eta, $\eta_{xy}$, computed
within the rows, for investigating the variability of x compared with the
total variability. The values of $\eta_{yx}$ and $\eta_{xy}$ may be quite
different.
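
The steps for eta and the F test for departure from linearity can be sketched in Python. The 3 x 3 frequency table below is invented to show an inverted-U relationship and is not the table of the example; the function names are hypothetical. Rows of `table` correspond to the y codes and columns to the x codes.

```python
import math

def eta_yx_and_r(table, dx, dy):
    """Return (eta_yx, r, N) from a frequency table.

    table[i][j] is the frequency for y code dy[i] and x code dx[j].
    """
    rows, cols = range(len(dy)), range(len(dx))
    n = sum(sum(row) for row in table)
    fx = [sum(table[i][j] for i in rows) for j in cols]      # column totals
    fy = [sum(table[i]) for i in rows]                       # row totals
    # Sum of d_y * f_xy within each column
    col_sums = [sum(dy[i] * table[i][j] for i in rows) for j in cols]
    between = sum(s ** 2 / f for s, f in zip(col_sums, fx) if f > 0)
    sum_y = sum(d * f for d, f in zip(dy, fy))
    sum_y2 = sum(d ** 2 * f for d, f in zip(dy, fy))
    sum_x = sum(d * f for d, f in zip(dx, fx))
    sum_x2 = sum(d ** 2 * f for d, f in zip(dx, fx))
    sum_xy = sum(dx[j] * col_sums[j] for j in cols)
    eta = math.sqrt((n * between - sum_y ** 2) / (n * sum_y2 - sum_y ** 2))
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return eta, r, n

def linearity_f(eta, r, n_arrays, n_cases):
    """F for departure from linearity, df = (n_arrays - 2, n_cases - n_arrays)."""
    return ((eta ** 2 - r ** 2) / (n_arrays - 2)) / (
        (1 - eta ** 2) / (n_cases - n_arrays))

# Hypothetical frequencies: y is high only in the middle column of x
dx = [0, 1, 2]
dy = [0, 1, 2]
table = [
    [4, 0, 4],   # y = 0
    [1, 2, 1],   # y = 1
    [0, 5, 0],   # y = 2
]

eta, r, n = eta_yx_and_r(table, dx, dy)
f_ratio = linearity_f(eta, r, n_arrays=len(dx), n_cases=n)
print(round(eta, 2), round(r, 2), round(f_ratio, 1))
```

In this constructed case r is zero while eta is high, the situation in which a straight line describes the data poorly and some other function should be sought.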


FITTING A CURVILINEAR FUNCTION TO BIVARIATE DATA 


More satisfactory than eta is a statement of the relationship between the
two variables that will yield the equation of the curve of best fit.
Theoretically, such an equation might be of any type of curve. One of the most
versatile curves in mathematics is the parabola, since it can describe a wide
variety of relationships. It would appear to be the most logical curve to
select for the description of relationships departing from rectilinearity.
Fitting a parabola to bivariate data is shown by Peters (3).

The interpretation of a curvilinear function of best fit parallels exactly
the interpretation of the straight line of best fit. There is a standard error
of estimate which is the square root of the variance of the residuals. The
process of fitting the function is a process of minimizing the sum of the
squares of these residuals. It is also possible to apply to curvilinear
correlation theory the principles of multiple and partial regression.
However, up to the present time, these have been applied very little in
psychological research. If and when observed data require the curvilinear
relationship of weighted sums and the curvilinear relationships of residuals,
the required procedures are either available or can be readily worked out.
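
A least squares parabola can be fitted by solving the three normal equations. The Python sketch below is one way to do so and is not the procedure given by Peters; the helper names and the five data points are invented for illustration, and the standard error of estimate uses the divisor N.

```python
def fit_parabola(xs, ys):
    """Least squares fit of y = c0 + c1*x + c2*x**2 via the normal equations."""
    # Sums of powers of x and of y times powers of x
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3 x 4 matrix of the normal equations
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * w for v, w in zip(m[row], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def standard_error_of_estimate(xs, ys, coeffs):
    """Square root of the mean squared residual around the fitted curve."""
    c0, c1, c2 = coeffs
    residuals = [y - (c0 + c1 * x + c2 * x * x) for x, y in zip(xs, ys)]
    return (sum(e * e for e in residuals) / len(xs)) ** 0.5

# Hypothetical points lying exactly on y = 1 + 2x - 0.5x^2
xs = [0, 1, 2, 3, 4]
ys = [1.0, 2.5, 3.0, 2.5, 1.0]

coeffs = fit_parabola(xs, ys)
se_est = standard_error_of_estimate(xs, ys, coeffs)
print([round(c, 3) for c in coeffs], round(se_est, 6))
```

If the data were in fact linear, the fitted coefficient of the squared term would be near zero, which agrees with the remark that a straight line is a special case of the curve of best fit.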


SUMMARY 


Implicit in any complete variance-covariance matrix (of which a correlation
matrix with 1.00's in the diagonal is a special case) are the variances,
standard deviations, and correlations of new variables defined as the
weighted or unweighted sums of the original variables.

The product-moment formula for r can be readily modified, for convenience in
computation, to cover the case where one variable is dichotomous
($r_{pt.bis}$), where both variables are dichotomous ($\phi$), or when both
variables consist of ranks ($\rho$).

If an estimate is needed of what the product-moment r would be if a single
dichotomous variable were continuously and normally distributed, then
$r_{bis}$ may be used. For a similar estimate involving two dichotomous
variables, the appropriate statistic is $r_{tet}$.

If, for two continuously distributed variables a coefficient $\eta$ is
markedly greater than r, the regression is not linear, and it is possible that
some nonlinear function may better describe the relationship.
EXERCISES


1. Below is the complete variance-covariance matrix of eight arithmetic
reasoning test items. (N = 1000)


         1      2      3      4      5      6      7      8

  1    .194   .012   .032   .030   .032   .030   .014   .029
  2    .012   .215   .016   .041   .026   .023   .018   .034
  3    .032   .016   .234   .035   .059   .073   .032   .051
  4    .030   .044   .035   .209   .029   .036   .033   .031
  5    .032   .026   .059   .029   .238   .051   .033   .056
  6    .030   .023   .073   .036   .051   .211   .036   .050
  7    .014   .018   .032   .033   .033   .036   .250   .045
  8    .029   .034   .051   .031   .056   .050   .045   .250


(a) Find the total variance of {һе eight-item test. 

(b) Find the correlation of each item with the total score. 

(c) Find the correlation between a test composed of the first four items and a 
test composed of the last four items. 


2. Below are the intercorrelations of six tests used in screening applicants for 
admission to a college of engineering. 


2 3 4 E 6 


1. Reading test 84 141 22: di I 
2. Language test 48 38 26 .34 
3. Reasoning test 29 .27 38 
4. Quantitative test 29 41 
5. Perceptual test 63 
6. Mechanical test 


Find the correlations between each of the six variables and a composite 
formed by adding all the variables (with equal standard deviations). 


3. Below are the joint frequencies (with total frequencies in the main diagonal) 
of six items on a reasoning test. (N = 248.) 
(a) Find the variance-covariance matrix. 
(b) Find the correlation matrix of ф coefficients. 


1 2 3 4 5 6
1 185 147 66 125 105 104 
2 147 198 63 134 111 107 
3 66 63 87 62 57 43 
4 125 134 62 169 101 105 
5 105 111 57 101 140 83 
6 104 107 43 105 83 136 
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4. Compute ф fi ing di 
which d р 4 the following diagrams and then state the conditions under 


- + = + 5 di 
+ | [25] +| 0 | 16|] + 16 
=| 26 | =5| 20 | = | 26 | 10 
- + - + - + 
+[5|%| + 4| [1 | 16 
-[в 151 -| а — | 16 | 10 


5. For the following artificial data, find $r_{pt.bis}$ and $r_{bis}$. In this
case, are there good reasons why $r_{bis}$ > 1.00 > $r_{pt.bis}$?


x 
exl ]p1 * s 2 5 6 7 Е 9 10 
Pass — 1 19 И 9 6 2 

Fail =0 2 6 9 M 19 


ations of the information in Table 


6. " А 
The following diagrams represent consolid: 
he r of .63 when the information 


6. i Е 
Ст Find ғы. and ға. and compare with t 
n 15 categories for each variable. 


X 
= Y 70-99 100-129 130-159 160-189 190-219 
1 1 13 83 145 32 
0 21 81 151 52 3 
нек: жәнне 
y 0 1 
1 97 177 
0 253 55 


7. From the sum formula, $r_{xy} = (V_{x+y} - V_x - V_y)/2s_x s_y$, develop a
formula for $\rho$ that involves finding the variance of the sums of paired
ranks.


8. For the data in Exercise 8, Chapter 6 (relationship between Army Alpha and
Army Beta for 1047 recruits in World War I), compute $\eta$ and test whether
the regression is significantly nonlinear.
FORECASTING HUMAN BEHAVIOR

10


ik Be МАША who ege one AN ШІН ГЇЇ "m 
"s HES maladiueted individuals та а УАЙ eee er v ШЫ 
Students for a professional school OF players fer se шананы Bays А 
making predictio e havior. — 

The — Ea » information related to his client's 
probable success in different occupations. The clinical psychologist
categorizes his patient, perhaps as mentally defective or as neurotic, and
proceeds to predict changes that would occur under different types of
education or therapy. The personnel psychologist, by his act of selection
or classification, predicts success as a student or employee.

Forecasting human behavior is hardly novel; it has been practiced with
varying degrees of success for thousands of years. Success in prediction has
brought fame to soldiers and power to statesmen.

PSYCHOLOGICAL PREDICTION 
Developments constituting the core of psychological prediction include:
procedures for measuring predictor variables; methods of scaling the
criterion; and the use of quantitative methods within a sample to describe
relationships useful in making predictions about cases yet to be observed.

Logically, psychological prediction involves four steps:


1. Selection of a sample believed to represent cases for which predictions 
are ultimately to be made. 

2. Within the sample, observation of relationships between one or more 
predictor variables on the one hand and a dependent or criterion vari- 
able on the other. When there are two or more predictor variables, their 
interrelationships must be ascertained. 

3. Observation of the predictor variables for new cases and the use of this 
information, together with information about their relationships, to 
make deductions about probable values in the unobserved dependent 
variable. 

4. Verification by noting the degree to which predicted and actual cri- 
terion values agree in one or more new samples. 
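
The four steps can be sketched for the simplest case of one predictor and one criterion. The Python below is only an illustration with invented numbers: a regression line is fitted in an original sample, applied to new cases, and then checked by correlating predicted with actual criterion values once these are known. The function names and data are assumptions, not material from the text.

```python
import math

def fit_line(xs, ys):
    """Least squares slope and intercept for predicting y from x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(xs, ys):
    """Ordinary product-moment r from paired scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Steps 1 and 2: original sample, predictor and criterion both observed
sample_x = [1, 2, 3, 4, 5]
sample_y = [2, 4, 5, 4, 5]
slope, intercept = fit_line(sample_x, sample_y)

# Step 3: new cases, predictor observed, criterion predicted
new_x = [2, 4, 6]
predicted = [intercept + slope * x for x in new_x]

# Step 4: criterion values become available; verify the forecasts
actual = [3, 5, 5]
validity = pearson_r(predicted, actual)
mean_abs_error = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

print(round(slope, 2), round(intercept, 2), round(validity, 2))
```

The correlation found in the new sample, rather than the one in the original sample, is the fairer indication of how well the forecasting system works.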


An appropriate sampling method is basic to the establishment of almost 
any psychological generalization. Certain sampling methods will be dis- 
cussed in Chapter 13. This chapter concentrates on how forecasts are made 
and how a forecasting system may be tested. 


THE CRITERION IN PREDICTION 


Any time a nonzero relationship between two variables is discovered, one 
can, in a sense, “predict” one variable from the other. Often the relation- 
ship between a pair of variables is zero, and if no more information is 
available, such a relationship is useless for prediction. However, in the 
discussion of “suppressor” variables in Chapter 7, it was noted that under 
special circumstances and in combination with at least one other variable, 
a variable that is uncorrelated with another may be useful in predicting it. 
When it is as easy to gather information simultaneously on two variables 
as on one, there is no need to set up a forecasting system, even though all 
required data are available. For example, a positive dependable relationship
exists between height and weight in a defined group, such as 18-year-old
males homogeneous as to racial, cultural, and socioeconomic background. 
Very easily, a system can be established for “predicting” height from 
weight or weight from height. With a large and well-defined sample, 
the proportion of error in such predictions can be ascertained, and the 
probability can be expressed that the discrepancy between a predicted and 
actual value will not exceed a stated amount. Ordinarily such “ predic- 
tions” have little interest because values on both variables are at hand. 
More usefully, forecasts are made in situations in which the predictive 
information is immediately available, but in which there is an interval of 
time before criterion data can be known. Typically, the applied psycholo- 
gist has scores from psychological tests, or data from projective devices, 
or information about the past history of individuals, which he uses as a 
basis for decisions about people. When there are alternate candidates from
whom to select for a training course or a position, or alternate possibilities
for the treatment or placement of a person, the act of decision constitutes a
prediction. Sometimes, especially in highly organized personnel programs,
predicted criterion values are actually computed and recorded before
decisions about individuals are made. In less formal programs, decisions
reflect, more or less accurately, implicit predictions.


SCALES USED IN PREDICTION 


Theoretically, any kind of scale can be used as a criterion: nominal, ordinal,
interval, or ratio. Similarly, any type of scale can be used to predict
information, either on the same or on a different kind of scale. When a
physician uses symptoms to diagnose a disease, he may be combining several
types of information: reaction to a physiological test as a nominal category;
perspiration on an ordinal scale; and temperature on an interval scale. His
objective may be classification into a disease category, belonging to a
nominal scale.

In abnormal psychology, nominal scales are often used as criteria, as
when responses to inkblots and personal inventories are used to differentiate
psychiatric categories. Both in vocational counseling and in personnel
classification, nominal scales can be used as criteria, but within each
criterion group other scales may be used, such as degree of satisfaction with
an occupation or degree of success on the job.

TESTING PSYCHOLOGICAL PREDICTIONS


Systematic testing of psychological prediction is possible when the criterion
presents a single dimension of excellence. As far as the mechanics are
concerned, it makes little difference whether the criterion is merely in two
degrees, like the pass-fail criterion used in studies of predicting the ability
to learn to fly an airplane, or a continuous variable such as the grade point
average used in studies of academic success in college. In the first instance,
prediction is in reference to broad categories; even when the predictors are
not very valid, it may be that a large proportion of the predictions will
turn out to be correct and that psychological procedures represent a
considerable improvement over other forecasting methods. On the other hand,
when a continuous variable is predicted, any deviation of the predicted
score from the criterion score, when it becomes available, is likely to be
considered an error. Sometimes the fact that a large proportion of the total
variance turns out to be unpredictable is taken to indicate that psychological
prediction methods are of little value. Actually, in real-life situations one
is usually more concerned with making the decisions that will improve the
caliber of personnel than in knowing precisely the eventual numerical
evaluation the personnel will receive.
FORECASTING INSTRUMENTS


EXPECTANCY CHART 


One of the simplest forecasting instruments is the expectancy chart, such 
as those illustrated in Figs. 10.1 and 10.2. In Fig. 10.2, one dichotomous 
variable is predicted from another; in Fig. 10.1 the criterion is dichoto- 
mous, but the predictor is nine steps and may be considered more or less 
continuous. 

The basis of an expectancy chart is a scatter diagram showing the rela- 
tionship between two variables. After a series of observations has been 
made, people are divided into categories according to the two variables. 
Usually there are two categories for the dependent or criterion variable and
two or more for the independent or predictor variable. Either of the two
categories of the criterion may be charted. In both Fig. 10.1 and Fig. 10.2 
percentage of success is shown. 

In Fig. 10.1 the nine categories represent different degrees of aptitude 
for pilot training. Actually, the nine-point pilot aptitude scale was based 
upon a battery of tests, weighted in accordance with the principles of 
multiple regression. 


PILOT APTITUDE
SCORE             NUMBER         NUMBER         PERCENT
("STANINE")       ELIMINATED     GRADUATED      GRADUATED

     9                 41           1048            96
     8                108            958            90
     7                271           1394        [illegible]
     6                590           1649            74
     5            [illegible]    [illegible]    [illegible]
     4            [illegible]        927        [illegible]
     3            [illegible]        355        [illegible]
     2                 37             16        [illegible]
     1            [illegible]    [illegible]    [illegible]


FIG. 10.1. EXPECTANCY CHART FOR THE PREDICTION OF SUCCESS IN PILOT TRAINING
FROM APTITUDE SCORE. Data from DuBois (1).
In using an expectancy chart in individual cases, information is available
as to the person’s score on the predictive measure but not, of course, on 
the outcome. What we want to predict is the individual’s future. From Fig. 
10.1 it can be said that the expectancy of the success of a young man with 
an aptitude rating of 9 is .96, whereas if his rating is only 1, the probability 
of his success is only .27. Thus, if conditions of training remain constant, 
then for 100 individuals who have the top rating, 96 will be expected to 
succeed and 4 to fail. Of 100 individuals with rating of 1, we may expect 27 
to succeed and 73 to fail. 
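
The arithmetic behind one bar of an expectancy chart can be sketched in a few lines. This is only an illustration: the function name `percent_graduated` and the dictionary layout are assumptions introduced here, and the counts are those of three rows of Fig. 10.1 (stanines 9, 8, and 6).

```python
def percent_graduated(eliminated: int, graduated: int) -> float:
    """Percentage of the cases in one predictor category that met the criterion."""
    total = eliminated + graduated
    if total == 0:
        raise ValueError("category contains no cases")
    return 100.0 * graduated / total


# (number eliminated, number graduated) for three stanine rows of Fig. 10.1.
fig_10_1_counts = {9: (41, 1048), 8: (108, 958), 6: (590, 1649)}

# Expectancy for each category, rounded to the nearest whole percent.
expectancy = {
    stanine: round(percent_graduated(eliminated, graduated))
    for stanine, (eliminated, graduated) in fig_10_1_counts.items()
}
print(expectancy)
```

Read as probabilities, these percentages are the expectancies quoted in the text: a rating of 9 corresponds to an expectancy of .96.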

Of course one cannot be sure what will happen to any particular person. 
If, however, the high relationship observed during the course of the
investigations continues to obtain, then the predictions in individual cases
will tend to be correct. If the correlation were 1.00, both during the original
investigation and subsequently, then all predictions would be perfectly
accurate. If, on the other hand, the correlation were zero, predictions would
be no better than blind guesses. With moderate degrees of correlation,
predictions are better than chance.


GROUP PREDICTIONS AND INDIVIDUAL PREDICTIONS 


Essentially there is no fundamental difference between predictions for
individuals and the so-called actuarial predictions for groups. The group is
made up of individuals. If we make N predictions for a group of N individuals,
it is very likely that the mean of the predictions will come closer to
the mean eventually obtained than will a prediction about a single
individual be close to his eventual criterion score. However, the basis of
prediction is exactly the same in both instances. Suppose, for example, there
are 100 individuals with a pilot aptitude rating of 3. Experience summarized
in Fig. 10.1 shows that 40 of these men can be expected to succeed and 60
to fail; that is, each man has a probability of .40 of success.

When the 100 men are sent into training, it is likely that the yield will be
reasonably close to the predicted yield of 40; that is, the mean criterion
score, with success coded 1 and failure coded 0, will be close to .40. There
is no way of identifying which 40 out of the 100 will succeed. Furthermore,
by reason of errors in sampling, errors in the predictive measurements,
and errors in measuring performance, the actual yield may be somewhat
larger or smaller than .40. However, it has been found repeatedly that
when actuarial predictions are made in large numbers of cases, the final
results are not greatly different from what was predicted. This is the
basis of prediction of death rates, voting behavior, and other
predictions in which there is more interest in averages than in what happens
in the individual cases. In fact, in voting behavior, information as to the
actual behavior of any specific individual when he goes to the polls is
generally unavailable.
Prediction of Success of Deaf Children in Rotation Test

                               N     PERCENT SUCCESSFUL IN ROTATION TEST

Success in Standing Test      39     85%
Failure in Standing Test      20     15%


Prediction of Success of Deaf Children in Standing Test

                               N     PERCENT SUCCESSFUL IN STANDING TEST

Success in Rotation Test      36     92%
Failure in Rotation Test      23     26%


FIG. 10.2. EXPECTANCY CHARTS FOR DICHOTOMOUS VARIABLES.
Data from Worchel and Dallenbach (4).


What differs most from one case of prediction to another is the standard 
of accuracy. If we endeavor to predict a continuous variable, such as 
average grade in college, and if any discrepancy between the predicted 
average grade and the actual average grade is counted as an error, then 
predictions by means of regression equations are highly inaccurate. On 
the other hand, if the predictions are evaluated in broad categories, such 
as being successful in college as opposed to failure in academic work, then 
the errors may not be very numerous even though the correlation is fairly 
low. 


PREDICTING ONE VARIABLE FROM ANOTHER 


In the study summarized in Fig. 10.2, Worchel and Dallenbach (4) studied 
59 deaf children with hearing loss ranging from 32 to 100 percent. Their 
“rotation test” involves sensitivity to motion in a rotation chair, as
indicated by dizziness or nystagmus or compensatory adjustments.

The results indicate that information as to whether or not a deaf child 
can stand on one foot is valid in predicting whether or not he will be sensi- 
tive to the usual effects of rotation. It should be noted that correlation 
relates to concomitant variation rather than to cause and effect or even 
to identical underlying mechanisms. Using additional information as to the 
degree of hearing loss and as to whether the deafness was congenital or 
adventitious, Worchel and Dallenbach came to the conclusion that the two 
tests involve different physiological mechanisms: the rotation test, the 
semicircular canals; and the standing test, the macular organs of the utricle 
and saccule. 

Originally the authors divided success on the standing test into four 
groups or categories and the rotation test into three, as shown in Table 10.1. 
TABLE 10.1. OBSERVED VARIABLES

                                        X = ROTATION TEST
                                                                        PERCENT WITH
                          FAILURE       SUCCESS        SUCCESS         AT LEAST
Y = STANDING TEST       | ALL TRIALS  | SOME TRIALS  | ALL TRIALS    | SOME SUCCESS
Immediate success       |      0      |      8       |      8        |     100
Success after practice  |      6      |      5       |     12        |      74
Failed: Average trial
  >4 seconds            |      7      |      1       |      2        |      30
Failed: Average trial
  <4 seconds            |     10      |      0       |      0        |       0


The product-moment r between the two variables is .62. A high degree of
predictability of success on the rotation test from score on the standing
test is indicated not only by the correlation coefficient but also by the
percentages shown in the right-hand column. Such percentages could be the
basis of an “expectancy chart,” showing what percent of the cases in each
category of one variable fall into a stipulated category in the second
variable. As in this instance, the category in the second variable may be
consolidated from two or more categories.


CONSOLIDATION INTO TWO DICHOTOMOUS VARIABLES 

In Table 10.2 both variables have been made dichotomous as success-failure.
Sometimes, when numbers of cases within the several cells are few,
such a procedure results in a more accurate picture of the situation than
when a number of steps are used with each variable. Here, however, it is
done chiefly to illustrate the study of a pair of dichotomous variables.


TABLE 10.2. DICHOTOMIZED VARIABLES

                          X = ROTATION TEST
Y = STANDING TEST   |  FAILURE  |  SUCCESS  |  TOTAL
Success             |   6 (a)   |  33 (b)   |   39
Failure             |  17 (c)   |   3 (d)   |   20
TOTAL               |  23       |  36       |   59


The computation of φ (product-moment r) by Formula 9.16a for dichotomies is

$$\phi = \frac{bc - ad}{\sqrt{(a + c)(b + d)(a + b)(c + d)}}
       = \frac{17 \times 33 - (6 \times 3)}{\sqrt{23 \times 36 \times 39 \times 20}}
       = .68$$

The fact that φ is higher than the r computed in Table 10.1 is surprising
because, when broader step intervals are used for identical data, r's
tend to decrease. Actually, there are five other ways of dichotomizing the
two variables of Table 10.1, and each yields a φ lower than that of Table
10.2, as shown in Table 10.3.


TABLE 10.3. ALTERNATE WAYS OF DICHOTOMIZINGᵃ


     0 | 16          8 |  8         19 | 20
    23 | 20         29 | 14         18 |  2
    φ = .49         φ = .16         φ = .40

    13 | 36         27 | 22
    10 |  0         10 |  0
    φ = .57         φ = .35


ᵃ Data from Table 10.1.


Since the two sets of measurements are not interval scales, one is not 
justified in dichotomizing at any two points which happen to strike the 
fancy. Certainly there is no justification in dichotomizing at several pairs 
of points and then choosing the pair that happens to yield the highest 
numerical relationship. 

In dichotomizing variables, one should either cut at the points that most 
logically separate the resultant categories, or cut as close to the two medians 
as possible. Table 10.2 probably meets both these criteria better than any of 
those shown in Table 10.3. 
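
The six possible dichotomizations of Table 10.1 can be checked mechanically. The following is a minimal sketch, not a procedure from the text: the helper names `phi` and `dichotomize` and the cut-point convention are assumptions made for illustration. The cell layout follows Table 10.2, in which b and c are the concordant cells.

```python
from math import sqrt

# Frequencies of Table 10.1. Rows: standing test, best to worst.
# Columns: rotation test (failure all trials, success some trials,
# success all trials).
TABLE_10_1 = [
    [0, 8, 8],    # immediate success
    [6, 5, 12],   # success after practice
    [7, 1, 2],    # failed, average trial > 4 seconds
    [10, 0, 0],   # failed, average trial < 4 seconds
]


def phi(a, b, c, d):
    """Phi for a fourfold table in which b and c are the concordant cells."""
    return (b * c - a * d) / sqrt((a + c) * (b + d) * (a + b) * (c + d))


def dichotomize(table, row_cut, col_cut):
    """Collapse to 2 x 2: rows before row_cut count as 'success' on Y,
    columns from col_cut onward count as 'success' on X."""
    a = sum(sum(row[:col_cut]) for row in table[:row_cut])
    b = sum(sum(row[col_cut:]) for row in table[:row_cut])
    c = sum(sum(row[:col_cut]) for row in table[row_cut:])
    d = sum(sum(row[col_cut:]) for row in table[row_cut:])
    return a, b, c, d


# Three possible row cuts times two possible column cuts.
phis = {
    (row_cut, col_cut): round(phi(*dichotomize(TABLE_10_1, row_cut, col_cut)), 2)
    for row_cut in (1, 2, 3)
    for col_cut in (1, 2)
}
for cuts, value in phis.items():
    print(cuts, value)
```

The cut (2, 1) is the dichotomy of Table 10.2; the other five are the alternatives of Table 10.3.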

Table 10.2 is an example of the degree to which one dichotomous variable 
can be used to predict another. For two dichotomous variables, the tech- 
nique of a regression equation can be justified mathematically; but no pre- 
dicted score would correspond to one of the discrete values on a two-point 
scale. Rather, for all those "successful" on one scale, we should predict 
a fractional success score on the other; and for all those who “failed” on 
one scale, we should predict a fraction indicating failure on the other. 
Only when the predictor is a continuous variable do scores predicted from 
a regression equation make much sense. If the criterion is dichotomous, the 
predictors in continuous form are related to the probability of “success.” 

If the sample on which Table 10.2 is based is representative, our predic- 
tions would be accurate in most cases. By predicting “success” on a second 
variable for cases with “success” on the first, and “failure” on the second
variable for cases with “failure” on the first, (33 + 17)/59, or 85 percent,
of the predictions would be expected to be correct and 15 percent incorrect.
In 15 percent of the cases, then, one would expect to be completely wrong; 
but, of course, there is no way of identifying such cases in advance. 
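
As a rough check on these figures, the sketch below computes the proportion of correct predictions from the cells of Table 10.2, together with the "fractional success scores" mentioned above. The variable names are illustrative assumptions. The second half relies on a standard property of least squares: with a 0/1 predictor and a 0/1 criterion, the regression prediction for each predictor group is simply the proportion of that group succeeding on the criterion.

```python
# Cell frequencies of Table 10.2 (Y = standing test, X = rotation test).
a, b = 6, 33    # standing success: rotation failure, rotation success
c, d = 17, 3    # standing failure: rotation failure, rotation success
n = a + b + c + d

# Predict "success" on the rotation test for standing-test successes and
# "failure" for standing-test failures; b and c are then the correct calls.
hit_rate = (b + c) / n

# Fractional success scores predicted for each predictor group.
predicted_given_success = b / (a + b)
predicted_given_failure = d / (c + d)

print(round(hit_rate, 2), round(predicted_given_success, 2),
      round(predicted_given_failure, 2))
```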
INCREASE IN CORRECT PLACEMENTS OVER CHANCE


If a 2 x 2 (or fourfold) table is used for prediction, as exemplified in the
simpler scatter diagram from which a phi coefficient is computed, it is
pertinent to inquire how much better the predictions are than sheer
guessing.

When the four frequencies of the 2 x 2 table are labeled a, b, c and d,
as in Table 10.4, the correct placements are b and c, and the incorrect
placements are a and d. The method of estimating placement in cells on
the basis of chance alone was discussed in Chapter 3 in connection with
chi square. This method results in theoretical or expected cell frequencies
that are proportional to both sets of marginal values, the $f_r$ and the
$f_c$. For any cell, the formula for the expected frequency is
$f_e = f_r f_c / N$.


TABLE 10.4. PREDICTION FROM A 2 X 2 TABLE

                   VARIABLE X
VARIABLE Y   |    -    |    +    |    f_r
    +        |    a    |    b    |  (a + b)
    -        |    c    |    d    |  (c + d)
   f_c       | (a + c) | (b + d) |  N = a + b + c + d


When one subtracts from b the number of correct placements expected
in that cell by chance, $f_r f_c / N$ or $[(a + b)(b + d)]/N$, the result
is $(bc - ad)/N$.
The number of correct placements in the entire diagram, in addition to
those expected by chance, is exactly double this amount, or $2(bc - ad)/N$.
Frequency in any cell expected purely on a basis of chance, that is, so
that cell frequencies will be proportional to marginal entries, is
$f_r f_c / N$.

The number of correct placements is (b + c). The theoretical number of
increases in correct placements because of a positive relationship between
the two dichotomous variables is

$$(b + c) - \frac{(a + b)(b + d)}{N} - \frac{(a + c)(c + d)}{N} = \frac{2(bc - ad)}{N}$$

This formula may yield fractional values, since the theoretical frequencies
may be fractional. The number of increases of correct placements in a
single category is seen to be $(bc - ad)/N$.

Table 10.5 presents hypothetical cases in which both distributions are
cut exactly in the middle. If the point of dichotomy is different for the two
variables, results differ somewhat, but the general picture remains more or
less the same.

It is seen from Table 10.5 that even a low correlation may increase the
proportion of correct placements.

TABLE 10.5. HYPOTHETICAL PREDICTIONS WITH PHI
(EQUAL DICHOTOMIES, N = 100)


                                     INCREASE        INCREASE IN
    FREQUENCIES*                     IN CORRECT      CORRECT PLACEMENTS
 -----------------     CORRECT       PLACEMENTS      IN UPPER CATEGORY
  a    b    c    d     PLACEMENTS    OVER CHANCE     OVER CHANCE          PHI
 25   25   25   25        50              0               0               .00
 24   26   26   24        52              2               1               .04
 23   27   27   23        54              4               2               .08
 22   28   28   22        56              6               3               .12
 21   29   29   21        58              8               4               .16
 20   30   30   20        60             10               5               .20
 19   31   31   19        62             12               6               .24
 18   32   32   18        64             14               7               .28
 17   33   33   17        66             16               8               .32
 16   34   34   16        68             18               9               .36
 15   35   35   15        70             20              10               .40
 14   36   36   14        72             22              11               .44
 13   37   37   13        74             24              12               .48
 12   38   38   12        76             26              13               .52
 11   39   39   11        78             28              14               .56
 10   40   40   10        80             30              15               .60
  9   41   41    9        82             32              16               .64
  8   42   42    8        84             34              17               .68
  7   43   43    7        86             36              18               .72
  6   44   44    6        88             38              19               .76
  5   45   45    5        90             40              20               .80
  4   46   46    4        92             42              21               .84
  3   47   47    3        94             44              22               .88
  2   48   48    2        96             46              23               .92
  1   49   49    1        98             48              24               .96
  0   50   50    0       100             50              25              1.00

* In the 2 x 2 diagram

          -     +
     +    a  |  b
     -    c  |  d

b and c represent numbers of correct placements; a and d, numbers of incorrect placements.
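
Any row of Table 10.5 can be regenerated from its four frequencies. The sketch below is illustrative only; the function name `placement_summary` and the returned keys are assumptions, while the arithmetic follows the expressions given in this section (chance frequencies proportional to the marginals, and an increase of 2(bc - ad)/N).

```python
from math import sqrt


def placement_summary(a, b, c, d):
    """Correct placements and their excess over chance for a 2 x 2 table
    in which b and c are the correct placements."""
    n = a + b + c + d
    correct = b + c
    # Chance frequencies in the two "correct" cells, proportional to marginals.
    chance_correct = ((a + b) * (b + d) + (a + c) * (c + d)) / n
    increase = correct - chance_correct            # equals 2(bc - ad)/N
    phi = (b * c - a * d) / sqrt((a + c) * (b + d) * (a + b) * (c + d))
    return {
        "correct": correct,
        "increase_over_chance": increase,
        "increase_in_upper_category": increase / 2,  # equals (bc - ad)/N
        "phi": phi,
    }


# The row of Table 10.5 with phi = .40.
row = placement_summary(15, 35, 35, 15)
print(row)

# The observed data of Table 10.2, for comparison.
observed = placement_summary(6, 33, 17, 3)
print(round(observed["increase_over_chance"], 1))
```

With unequal marginals, as in the data of Table 10.2, the increase is fractional, as the text notes.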


Consider a situation in which 50 percent of the individuals working on a 
job were considered successful and for which a test was available that, 
when scored dichotomously, yielded a phi of .40 as a validity coefficient. 
Suppose we now hire only those who are above the median on the test and 
that work standards are not changed. It would now be expected that 35 
instead of 25 of each 50 hired would be successful. Thus the percentage of 
successful employees would go from 50 to 70 percent. The increase in the 
efficiency of the organization as a result of better selection could thus be 
considerable. 
PREDICTION WITH A CONTINUOUS VARIABLE


The method of establishing a basis for predicting one continuous variable
from another is illustrated in Example 10.1. Within a sample of cases the
relationship between the variables is summarized by a coefficient of
correlation.

When only two variables are concerned, the predicted value of $z_0$ or
$X_0$ is merely a linear transformation of the predictor variable and hence
correlates perfectly with it. When there is only a single predictor and we are
interested merely in ranking the cases with respect to the probable criterion
scores, the following have identical utility:


1. Scores as predicted from a regression equation;
2. Any linear transformation of predictor scores; and
3. The original predictor scores.

The value of the regression equation in this instance lies only in finding
predicted scores that numerically deviate as little as possible from actual
criterion values, using a procedure that makes the sum of the squares of
the differences between actual and predicted scores as small as possible.

It is apparent that, as the correlation increases, errors in prediction
decrease. This is shown by the formula for the standard deviation of the
discrepancies between observed scores and predicted scores (Formula 6.8):

$$s_{0.1} = s_0 \sqrt{1 - r_{01}^2} \qquad (6.8)$$

While $s_{0.1}$ can be called the standard error of estimate, or
$s_{est\,0}$, theoretically it requires correction before it can be applied
to cases not yet observed. It can be shown that the appropriate correction
requires division of the sum of the squares of the errors by the number of
degrees of freedom (in this case, N - 2) instead of N. Accordingly, the
formula for the standard error of estimate (Formula 6.8) becomes

$$s_{est\,0} = s_0 \sqrt{1 - r_{01}^2} \, \sqrt{\frac{N}{N - 2}} \qquad (6.8a)$$

With 30 cases, $s_{est\,0}$ by Formula 6.8a is about 3 percent larger than
it is without the correction; whereas, with very large N, the correction is
negligible. However, when N is small and it is necessary to make precise
statements about the zone in which a criterion score is likely to be found,
the use of the correction is essential.

PREDICTING A DICHOTOMIZED CRITERION

The factor $\sqrt{1 - r_{01}^2}$, sometimes called the coefficient of
alienation, which appears in the formula for the standard error of estimate,
$s_{est\,0}$, is the basis for prevalent (but unnecessary) pessimism about
the effectiveness of prediction in psychology and education.
It is true that only when r reaches .707 is the variance of the errors of
prediction reduced to half of the original criterion variance. At that point, 
the standard deviation of the errors is .707 of the standard deviation of the 
criterion. Since in psychology very few correlations between even a team 
of predictors and a criterion reach .707, it might appear that prediction by 
psychological methods is ineffective. Such is not the case. 

In a practical situation, such as selecting employees, the usefulness of a 
psychological instrument depends upon three independent factors: 


1. The proportion of cases considered “satisfactory” without the use of
psychological methods;

2. The selection ratio, or proportion of individuals selected for the job or 
for the training program; and 

3. The validity of the instrument in the original range of talent. 


Grading individuals as “satisfactory” or “unsatisfactory” means either a
dichotomous criterion or dichotomizing a continuous criterion. Obviously, 
if only a small proportion of unselected individuals is satisfactory, selection 
techniques have more possibility of being helpful than if a large proportion 
is satisfactory. The principles now under discussion apply, whether the 
criterion is dichotomous or is graded in categories, but they are easier to 
demonstrate for a dichotomous criterion. 

If all applicants must be selected, then of course psychological
instruments cannot improve their quality. However, as the selection ratio
decreases, provided the instrument has at least some validity, the quality of
selected personnel improves.

The importance of the third factor, the validity (or correlation between 
the predictor and the criterion in the original group) is obvious, since 
without some degree of correlation, prediction is no better than a guess. 

The artificial data of Example 10.1 may be considered as showing the
relationship between a predictor, $X_1$, and job success, $X_0$, prior to
the use of the predictor for selection. In the computations in Table 10.10,
it will be noted, however, that the validity is high, .64, which is greater
than that often found in real situations.

When an $X_0$ score of 50 or more is taken as representing “satisfactory”
individuals, the situation is summarized in the upper part of Table 10.6,
which assumes that with greater selectivity, performance standards remain
unchanged. It may be noted that when the proportion selected or “selection
ratio” becomes .77 (386 chosen out of 500), the proportion of
“satisfactory” individuals increases from .77 to .86. When the selection ratio
decreases still further to .40 (201 out of 500 selected), there is a further
increase in the proportion of satisfactory individuals to .97, while with a
selection ratio of .11 one would expect all those chosen to be
“satisfactory.”




TABLE 10.6. NUMBERS AND PROPORTIONS “SATISFACTORY” AND
“UNSATISFACTORY” WITH NO SELECTION AND WITH THREE CUT-OFF POINTSᵃ


                                                  CUT-OFF SCORES
                                     NO
                                  SELECTION    X1 = 45    X1 = 55    X1 = 65

Number “Satisfactory”ᵇ               387         331        194         53
Proportion “Satisfactory”            .77         .86        .97       1.00
Number “Unsatisfactory”              113          55          7          0
Proportion “Unsatisfactory”          .23         .14        .03        .00
TOTAL SELECTED                       500         386        201         53
SELECTION RATIO                     1.00         .77        .40        .11


                                                  CUT-OFF SCORES
                                     NO
                                  SELECTION    X1 = 45    X1 = 55    X1 = 65

Number “Satisfactory”ᶜ               201         189        139         46
Proportion “Satisfactory”            .40         .49        .69        .87
Number “Unsatisfactory”              299         197         62          7
Proportion “Unsatisfactory”          .60         .51        .31        .13
TOTAL SELECTED                       500         386        201         53
SELECTION RATIO                     1.00         .77        .40        .11


ᵃ Data from Example 10.1.
ᵇ “Satisfactory” defined as X0 score of 50 or more.
ᶜ “Satisfactory” defined as X0 score of 70 or more.


If performance standards are raised so that, without selection, less than
half are considered satisfactory, expected results are shown in the lower
half of Table 10.6. With successive decreases in the selection ratio, a
dramatic increase may be noted in the likelihood that a selected individual
will prove to be satisfactory. Without screening, and a criterion score of 70
being considered as satisfactory, the individual's likelihood of being
regarded satisfactory is .40; but with a selection ratio of .11, his
likelihood is more than twice as great, or .87.
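
The lower half of Table 10.6 can be rebuilt from the predictor distributions of the two criterion groups. The sketch below assumes the frequencies listed in Table 10.9 (criterion score of 70 or more counted as satisfactory); the names `FREQ` and `selection_outcome` are introduced here for illustration only.

```python
# (lower limit of X1 interval, unsatisfactory f, satisfactory f), as listed in
# Table 10.9, with "satisfactory" meaning an X0 score of 70 or more.
FREQ = [
    (75, 0, 6), (70, 2, 12), (65, 5, 28), (60, 23, 38), (55, 32, 55),
    (50, 62, 36), (45, 73, 14), (40, 51, 10), (35, 32, 1), (30, 13, 1),
    (25, 6, 0),
]


def selection_outcome(freq, cutoff):
    """Counts and proportions among cases at or above the cut-off score."""
    n_total = sum(unsat + sat for _, unsat, sat in freq)
    unsat_sel = sum(unsat for lower, unsat, _ in freq if lower >= cutoff)
    sat_sel = sum(sat for lower, _, sat in freq if lower >= cutoff)
    selected = unsat_sel + sat_sel
    return {
        "selected": selected,
        "selection_ratio": round(selected / n_total, 2),
        "satisfactory": sat_sel,
        "proportion_satisfactory": round(sat_sel / selected, 2),
    }


outcomes = {cutoff: selection_outcome(FREQ, cutoff) for cutoff in (45, 55, 65)}
for cutoff, result in outcomes.items():
    print(cutoff, result)
```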


THE TAYLOR-RUSSELL TABLES

Taylor and Russell (3) prepared tables, reproduced in part as Table 10.7,
for nine different proportions of employees considered satisfactory, ranging
from .10 to .90. Entry is by means of the validity coefficient before
selection (from .05 to .95) and the selection ratio (from .10 to .90). The
predicted proportion of employees considered satisfactory is thus obtained
from three factors:

1. Proportion considered satisfactory without selection;
2. Correlation between predictor and criterion; and
3. Proportion chosen through the use of the selection device.
Since Table 10.6 is designed to illustrate the principle behind the
Taylor-Russell tables, it is of interest to compare results. In the lower
part of Table 10.6 the proportion satisfactory without selection is .40 and
the validity is approximately .65. When the cut-off score is 55, the selection
ratio is .40. Under these circumstances we expect the proportion satisfactory
to be .69. In the part of Table 10.7 for “Proportion of Employees Considered
Satisfactory = .40” with a validity of .65 and selection ratio of .40, we find
that the expected proportion of satisfactory employees is .67. Thus, findings
from the hypothetical scatter diagram and the Taylor-Russell tables
are practically identical.

This is really not surprising, since the Taylor-Russell tables, like
Example 10.1, are constructed from the proportion of cases expected in
various regions of the scatter diagram under specified conditions, including
linear relationship between criterion and predictor.


PREDICTION IN TERMS OF COST AND UTILITY


Another way of looking at a scatter diagram representing the relationship
between a criterion and a predictor is in terms of “cost,” defined as the
percentage of satisfactory individuals rejected at a given cut-off point, and
“utility,” defined as the percentage of unsatisfactory individuals rejected.
The tabulation of cost and utility in Table 10.8 summarizes the effectiveness
of prediction under defined circumstances; namely, the validity, the
proportion satisfactory without selection, and the selection ratio.
Expectancy is shown in a format rather different from the Taylor-Russell
tables, although the underlying phenomena are, of course, identical.


TABLE 10.8. EFFECTIVENESS OF SELECTION IN TERMS OF COST AND UTILITYᵃ


                                            % SATIS-                  % UNSATIS-
           TOTAL                NO. SATIS-  FACTORY      NO. UNSATIS- FACTORY
CUT-       NO. RE-    % RE-     FACTORY     REJECTED:    FACTORY      REJECTED:
OFF        JECTED     JECTED    REJECTED    “COST”       REJECTED     “UTILITY”

 75         494        98.8       195         97.0         299         100.0
 70         480        96.0       183         91.0         297          99.3
 65         447        89.4       155         77.1         292          97.7
 60         386        77.2       117         58.2         269          90.0
 55         299        59.8        62         30.8         237          79.3
 50         201        40.2        26         12.9         175          58.5
 45         114        22.8        12          6.0         102          34.1
 40          53        10.6         2          1.0          51          17.1
 35          20         4.0         1          0.5          19           6.4
 30           6         1.2         0          0.0           6           2.0


ᵃ Data from Example 10.1. A score of 70 or better in X0 is taken as
satisfactory; N = 201, satisfactory; N = 299, unsatisfactory.
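
Cost and utility, as defined above, follow directly from the same predictor distributions. The sketch below again assumes the frequencies of Table 10.9; `FREQ` and `cost_and_utility` are illustrative names. A cut-off of 55 is used as the example because, by Table 10.6, it selects 201 of the 500 cases and so rejects 299.

```python
# (lower limit of X1 interval, unsatisfactory f, satisfactory f), as listed in
# Table 10.9, with "satisfactory" meaning an X0 score of 70 or more.
FREQ = [
    (75, 0, 6), (70, 2, 12), (65, 5, 28), (60, 23, 38), (55, 32, 55),
    (50, 62, 36), (45, 73, 14), (40, 51, 10), (35, 32, 1), (30, 13, 1),
    (25, 6, 0),
]


def cost_and_utility(freq, cutoff):
    """Percent of satisfactory cases rejected ("cost") and percent of
    unsatisfactory cases rejected ("utility") when cases below the cut-off
    score are rejected."""
    total_unsat = sum(unsat for _, unsat, _ in freq)
    total_sat = sum(sat for _, _, sat in freq)
    unsat_rejected = sum(unsat for lower, unsat, _ in freq if lower < cutoff)
    sat_rejected = sum(sat for lower, _, sat in freq if lower < cutoff)
    return {
        "rejected": unsat_rejected + sat_rejected,
        "cost": round(100 * sat_rejected / total_sat, 1),
        "utility": round(100 * unsat_rejected / total_unsat, 1),
    }


at_55 = cost_and_utility(FREQ, 55)
at_45 = cost_and_utility(FREQ, 45)
print(at_55)
print(at_45)
```

Raising the cut-off buys utility at the price of cost: moving from 45 to 55 rejects far more unsatisfactory cases but also turns away about 31 percent of the satisfactory ones.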
CUT-OFF POINTS AND CRITICAL POINTS


In the application of psychological techniques to personnel selection, the 
cut-off point on the predictor variable is often established administratively. 
Someone in authority makes a decision as to the number of position 
vacancies to be filled or the number of students to be accepted. After con- 
sideration of the number of applicants and the distribution of the predic- 
tive scores, a point is established above which candidates are to be accepted 
and below which they are to be rejected. When all information is available 
at the time the decision is made, a firm cut-off point can be established and 
those with highest predictive scores can be selected. Often, however, in a 
rapidly changing situation, cut-off points are raised or lowered according 
to the supply of candidates and the needs of the organization for new per- 
sonnel, yielding a solution less than optimal. 

As contrasted with “cut-off,” the use of the term critical point implies an 
unvarying criterion. Sometimes a critical point can be established as the 
average predictive score corresponding to a criterion value defined as a 
certain standard of work. When the criterion is dichotomous, the critical 
score might be the point on the predictor variable at which the individual 
would have a specified probability of being “satisfactory” on the criterion. 

If, with the data of Example 10.1, we take an $X_0$ score of 70 or more as
being “satisfactory” and an $X_0$ score of 69 or less as “unsatisfactory,”
the predictor distributions for the two groups would be as shown in Table 10.9.

With real data and with a correlation as high as .64, regular increases in 
“percent satisfactory” from step to step would be expected. The reason 
for certain irregularities and approximately equal percentages in three pairs 
of steps is that, in the example, regression departs slightly from linearity. 
However, the results show reasonably well how a critical point can be set
up.


TABLE 10.9. NUMBERS OF SATISFACTORY AND UNSATISFACTORY CASES BY
CATEGORIESᵃ


                UNSATISFACTORY    SATISFACTORY    PERCENT
X1 SCORES             f                f          SATISFACTORY

  75-79               0                6              100
  70-74               2               12               86
  65-69               5               28               85
  60-64              23               38               62
  55-59              32               55               63
  50-54              62               36               37
  45-49              73               14               16
  40-44              51               10               16
  35-39              32                1                3
  30-34              13                1                7
  25-29               6                0                0


ᵃ Data from Example 10.1. Critical point, X0 = 70.
The same data are presented graphically in Fig. 10.3. If the critical point
is defined as the predictor score at which the probability of being
satisfactory is .50, the point at which the two distributions cross locates
the score. The histogram does not permit precise location of the point, but
it is apparent that an $X_1$ value of 55 would be approximately correct.


FIG. 10.3. ESTABLISHMENT OF A CRITICAL POINT GRAPHICALLY. (DATA FROM
EXAMPLE 10.1.)
One distribution is of “satisfactory” individuals (X0 = 70 or more); the
other is of “unsatisfactory” individuals (X0 = 69 or less). The vertical
axis shows N. Critical point, at which probability of success goes from less
than .50 to more than .50, is approximately 55.

There is, of course, no requirement that the critical score be set
at the point at which the probability becomes .50. It could be taken at any
value of p, such as .40, or .60, or .75. In an applied testing situation, other
considerations might enter into the decision. A study might show that
unless the probability of success of candidates was .75 or better, the firm
would be likely to lose money by hiring them. Accordingly, a probability of
.75 might be used in establishing the critical score.
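
One way to locate a critical score numerically, rather than from the graph, is sketched below. The rule used here is an assumption made for illustration and is not stated in the text: scan the intervals of Table 10.9 from the top down and stop at the first interval whose proportion satisfactory falls below the required probability p. The names `FREQ` and `critical_point` are likewise illustrative.

```python
# (lower limit of X1 interval, unsatisfactory f, satisfactory f), as listed in
# Table 10.9, with "satisfactory" meaning an X0 score of 70 or more.
FREQ = [
    (75, 0, 6), (70, 2, 12), (65, 5, 28), (60, 23, 38), (55, 32, 55),
    (50, 62, 36), (45, 73, 14), (40, 51, 10), (35, 32, 1), (30, 13, 1),
    (25, 6, 0),
]


def critical_point(freq, p):
    """Lowest interval limit such that this interval and every higher one
    has a proportion satisfactory of at least p. Returns None if even the
    top interval falls short."""
    critical = None
    for lower, unsat, sat in sorted(freq, reverse=True):
        if sat / (sat + unsat) < p:
            break
        critical = lower
    return critical


print(critical_point(FREQ, 0.50))
print(critical_point(FREQ, 0.75))
```

With p = .50 the rule gives 55, agreeing with the graphical reading; a stricter requirement of .75 moves the critical score up to 65.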


CORRECTION FOR CHANGES IN RANGE

The accuracy of predicting $X_0$ from any particular $X_1$ score is not
directly indicated by $r_{01}$, but rather by the variance of the array
($X_1$ column) in which the $X_1$ score appears.
The bivariate distribution in Example 10.1 has been constructed so that
it is homoscedastic vertically, that is to say, the variances within the columns 
are almost precisely equal. Few bivariate distributions based on empirical 
psychological data are as homoscedastic vertically as this distribution. 
However, many are reasonably homoscedastic in two dimensions, as this 
one is not, since the variability in the horizontal arrays varies greatly. 


EXAMPLE 10.1 


PREDICTION WITHIN A BIVARIATE DISTRIBUTION 


In Table 10.10 the regression for predicting $x'_0$ from $x'_1$ is
approximately linear, and the line connecting the means would extend about
from the cell identified with 0 in the column at the left to the cell
similarly identified in the column at the right. In each column, which
corresponds to a single $x'_1$ value, the most probable value of $x'_0$ is at
the column mean.

Variabilities in terms of coded scores are approximately equal.

The 0 cell in column ($x'_1 = 10$) is ten steps to the right and six steps
above the 0 cell in column ($x'_1 = 0$). Since r has been shown to be the
slope of the regression line when variabilities are equalized, we can estimate
r as 6/10, or .60, which agrees quite well with the computed value of .64. Of
course, only when variabilities are equal and regression is strictly linear
can r be estimated in this fashion. In effect, the formulas for r convert
original scores to z-scores, so that $\sum z_0 z_1 / N$ is both the slope of
the straight line of best fit and the correlation coefficient.


Estimation of r from the variability around the regression line. The artificial
bivariate distribution in Table 10.10 was constructed so as to be homoscedastic
within columns. (Definitely, it is not homoscedastic within rows.) The variability
within two columns ($x'_1 = 4$ and $x'_1 = 6$) may be taken as typical. The variance
around the mean in terms of coded scores is

\[ V_{0.1} = \frac{\sum x_0'^2}{N} - \left(\frac{\sum x'_0}{N}\right)^2 = 2.161 \qquad (N = 87) \]

This may be taken as an estimate of the partial variance of the criterion after
variance associated with the predictor has been removed, or (except for a correction
involving the degrees of freedom, as explained in the text) as the square
of the standard error of estimate.


The total variance of variable 0 in terms of coded scores is 3.808.
From the discussion in Chapter 6, it would be expected that

\[ r_{01} = \sqrt{\frac{3.808 - 2.161}{3.808}} = .66 \]

This result again is close to the computed value of .64.

This is not a practical computation method, since it applies only when regression
is linear and when the bivariate distribution is homoscedastic around the
regression line.
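
The arithmetic of this estimate can be checked with a short Python sketch. It assumes, as the example does, linear regression and homoscedasticity around the regression line; the function name `r_from_residual_variance` is illustrative and does not come from the text.

```python
import math


def r_from_residual_variance(total_var: float, residual_var: float) -> float:
    """Estimate r as sqrt((V0 - V0.1) / V0).

    total_var    -- total variance of the criterion (V0)
    residual_var -- variance around the regression line (V0.1)
    Only meaningful when regression is linear and arrays are homoscedastic.
    """
    if not (total_var > 0 and residual_var >= 0 and total_var >= residual_var):
        raise ValueError("need 0 <= residual_var <= total_var and total_var > 0")
    return math.sqrt((total_var - residual_var) / total_var)


# Values quoted in Example 10.1 (coded scores).
r_est = r_from_residual_variance(3.808, 2.161)
print(round(r_est, 2))
```

The same function run in reverse order of reasoning gives the partial variance: multiplying the total variance by (1 - r squared) recovers the variance around the regression line.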


Whenever we assume that the variance around the regression line (that 
is, the partial variance of the criterion after variance predictable from the 
independent variable has been removed) represents the error in prediction 
throughout the range, we assume homoscedasticity. 

When the symbol $V_{0.1}$ is used to indicate the variance around the regression
line, its numerical size depends upon the units in which the criterion
is measured and is equal to $V_0(1 - r_{01}^2)$. If the criterion is in z scores with
variance of unity, as in multiple correlation, $V_{0.1}$ becomes $(1 - r_{01}^2)$.
Here, for convenience in computation, $V_{0.1}$ is used to mean the partial
variance of the criterion in terms of coded scores. Exactly the same results
can be obtained by using all variances either in raw-score form or in z form.


ASSUMPTION OF HOMOSCEDASTICITY 


On the assumption of homoscedasticity, inferences can be made as to how
changes in range will affect the correlation coefficient.

As pointed out in Chapter 6, the term changes in range does not refer to
a transformation on one variable or the other such that the range (as
measured by the difference between highest and lowest scores) changes
numerically. Linear changes in scale values have, of course, no effect on r.
However, if at either end of a bivariate distribution, cases are added or
taken away so that the range is increased or decreased, the magnitude of
r will change.

Consider a selection device that on experimental trial yields a correlation
with the criterion of, say, .64. As the consequence of the finding of
a highly valid predictor, a decision is made henceforth to accept only
individuals with high scores. Provided criterion standards remain the same,
it will be found that when the device is validated again, the correlation in
the restricted range will be reduced from that observed in the original, or
unrestricted, range. The amount of the reduction will be related to the
degree of restriction.

When the swarm of points representing the scatterplot of two variables
falls in a perfectly circular area (after variabilities are equalized), r is .00,
whereas the closer the points are to the regression line, the higher the r.
The scatterplot of Example 10.2 is the bivariate distribution of Example
10.1 less the first four columns. When compared with its original, it will be
noticed that if the variabilities of the $X_0$ and $X_1$ distributions were equalized
again, the scatterplot would be more nearly circular than originally. As
expected, the correlation for the data of Example 10.2 is less than for
Example 10.1. It is .55.

EXAMPLE 10.2


COMPUTATIONS FOR CORRECTION FOR CHANGE IN RANGE 


(Data are seven columns from Example 10.1.) 


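X1                    45-49  50-54  55-59  60-64  65-69  70-74  75-79
X0        x'0  x'1 =      0      1      2      3      4      5      6    Cf0    m  CΣd1f1
110-119     9                                         1      1      2      4   17      21
100-109     8                           4      3      4      1      2     18   15      71
 90-99      7                    5     10      7      7      3            50   13     160
 80-89      6             4     11     18     13      9      4      1    110   11     308
 70-79      5            10     20     23     15      7      3      1    189    9     468
 60-69      4            18     26     18     13      4      1           269    7     590
 50-59      3            23     20     10      7      1      1           331    5     660
 40-49      2            18     11      4      3                         367    3     688
 30-39      1            10      5                                       382    1     693
 20-29      0             4                                              386          693

Cf1                     386    299    201    114     53     20      6
m                                1      3      5      7      9     11
CΣd0f0                 1720   1459   1067    632    327    129     45


Computations

\[ \sum x'_0 = \sum Cf_0 = 1720 \]
\[ \sum x_0'^2 = \sum m\,Cf_0 = 8920 \]
\[ v_0 = \frac{\sum x_0'^2}{N} - \left(\frac{\sum x'_0}{N}\right)^2 = \frac{8920}{386} - \left(\frac{1720}{386}\right)^2 = 3.2532 \quad \text{(in terms of } x'_0\text{)} \]
\[ \sum x'_1 = \sum Cf_1 = 693 \]
\[ \sum x_1'^2 = \sum m\,Cf_1 = 2089 \]
\[ v_1 = \frac{\sum x_1'^2}{N} - \left(\frac{\sum x'_1}{N}\right)^2 = \frac{2089}{386} - \left(\frac{693}{386}\right)^2 = 2.1887 \quad \text{(in terms of } x'_1\text{)} \]
\[ \sum x'_0 x'_1 = \sum C\Sigma d_0 f_0 = \sum C\Sigma d_1 f_1 = 3659 \]
\[ r_{01} = \frac{N\sum x'_0 x'_1 - \sum x'_0 \sum x'_1}{\sqrt{\left[N\sum x_0'^2 - \left(\sum x'_0\right)^2\right]\left[N\sum x_1'^2 - \left(\sum x'_1\right)^2\right]}} = \frac{386(3659) - (1720)(693)}{\sqrt{\left[386(8920) - (1720)^2\right]\left[386(2089) - (693)^2\right]}} = .55 \]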


Computation of r. This example shows again a procedure for computing r
from a scatter diagram, already demonstrated in Examples 6.3 and 10.1. Data
are exactly the same as for Example 10.1, except that the first four columns have
been omitted and both variables have been recoded.

Knowledge of the variances and the correlation of the truncated bivariate
distribution permits an empirical study of correction for changes in range, as
explained in the text.
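
As a check on the computations of Example 10.2, the following Python sketch recomputes the two variances and r directly from the six sums quoted in the example (N = 386). The variable and function names are illustrative only.

```python
import math

# Sums quoted in Example 10.2 (coded scores).
N = 386
sum_x0, sum_x0_sq = 1720, 8920   # criterion, x'0
sum_x1, sum_x1_sq = 693, 2089    # predictor, x'1
sum_x0x1 = 3659                  # cross products


def coded_variance(n: int, total: float, total_sq: float) -> float:
    """Variance as mean of squares minus squared mean."""
    return total_sq / n - (total / n) ** 2


v0 = coded_variance(N, sum_x0, sum_x0_sq)
v1 = coded_variance(N, sum_x1, sum_x1_sq)

# Product-moment r from raw sums.
numerator = N * sum_x0x1 - sum_x0 * sum_x1
denominator = math.sqrt(
    (N * sum_x0_sq - sum_x0 ** 2) * (N * sum_x1_sq - sum_x1 ** 2)
)
r = numerator / denominator

print(round(v0, 4), round(v1, 4), round(r, 2))
```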


FORECASTING HUMAN BEHAVIOR 261 


CORRECTION ON THE INDEPENDENT VARIABLE WHEN CRITERION
VARIANCES ARE KNOWN

The correction is for restriction rather than extension of range, but
the principle is identical. In the basic formulas there is merely a change in
which correlation is known and which is unknown.

The following notation is applicable:

$v_i$ = variance of a variable in lesser range.
$V_i$ = variance of the same variable in a greater range that includes
the lesser range.
$r_{ij}$ = correlation in the lesser range.
$R_{ij}$ = correlation in the greater range (in this case R does not refer to
multiple R).

It is assumed that the partial variance of the criterion after the portion
predictable from the independent variable has been partialed out is identical
in both ranges. This assumption is that $v_{0.1} = V_{0.1}$. This is clearly true
when the vertical arrays have equal variances, as in Example 10.1, and may
be true under other circumstances. If $v_{0.1} = V_{0.1}$, then, expanding each,

\[ v_0(1 - r_{01}^2) = V_0(1 - R_{01}^2) \]

When three of the four constants are known, this equation can be readily
solved for the fourth.

Solving for $R_{01}$ yields

\[ R_{01} = \sqrt{1 - \frac{v_0}{V_0}\left(1 - r_{01}^2\right)} \tag{10.1} \]

This formula is not very useful in practice. When the independent variable
is restricted, we are more likely to know its two variances than we are
to know the two variances of the criterion. Nevertheless, to test the approach,
Formula 10.1 is applied to the data of Example 10.2. Then

\[ R_{01} = \sqrt{1 - \frac{3.2532}{3.8080}\left[1 - (.55)^2\right]} = .64 \]

a result identical with $R_{01}$ as computed in Example 10.1.
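
A minimal Python sketch of Formula 10.1 follows. It assumes the relation stated above (equal partial variance of the criterion in both ranges) and takes the unrestricted criterion variance as 3.808, the total variance of variable 0 given in Example 10.1. The function name is illustrative.

```python
import math


def correct_r_criterion_variances(r: float, v0: float, V0: float) -> float:
    """Formula 10.1: R = sqrt(1 - (v0 / V0) * (1 - r**2)).

    r  -- correlation in the restricted range
    v0 -- criterion variance in the restricted range
    V0 -- criterion variance in the unrestricted range
    """
    return math.sqrt(1 - (v0 / V0) * (1 - r ** 2))


# Data of Example 10.2, with V0 taken from Example 10.1.
R01 = correct_r_criterion_variances(r=0.55, v0=3.2532, V0=3.8080)
print(round(R01, 2))
```

When the two variances are equal the function returns r unchanged, which is a quick sanity check on the formula.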


CORRECTION FOR RESTRICTION WHEN PREDICTOR VARIANCES ARE KNOWN

When predictor instead of criterion variances are known, corrections must
depend on the partial variances of the predictor. However, it is clear that
even if the bivariate distribution were homoscedastic horizontally before
restriction, it would not be so subsequently. Hence, it cannot be assumed
that $v_{1.0} = V_{1.0}$. Actually, $V_{1.0}$ is necessarily greater than $v_{1.0}$.
A plausible assumption is that the change in the proportion of predictable
variance ($r_{01}^2$ over $R_{01}^2$) will be directly related to the change in the
proportion of unpredicted variance around the regression line when variable
1 is predicted from variable 0. Thus

\[ \frac{r_{01}^2}{R_{01}^2} = \frac{v_{1.0}}{V_{1.0}} = \frac{v_1(1 - r_{01}^2)}{V_1(1 - R_{01}^2)} \]

Again, the equation has four constants, and when three are known, the
fourth can be determined. Solving for $R_{01}$ yields

\[ R_{01} = \sqrt{\frac{V_1 r_{01}^2}{v_1 - v_1 r_{01}^2 + V_1 r_{01}^2}} \tag{10.2} \]

while solving for $r_{01}$ gives

\[ r_{01} = \sqrt{\frac{v_1 R_{01}^2}{V_1 - V_1 R_{01}^2 + v_1 R_{01}^2}} \tag{10.3} \]

Applying Formula 10.2 to the data of Example 10.2,

\[ R_{01} = \sqrt{\frac{4.0080(.55)^2}{2.1887 - 2.1887(.55)^2 + 4.0080(.55)^2}} = .67 \]

The result is a little higher than the value of .64 computed in Example
10.1, but nevertheless appears to be a good approximation. If the horizontal
arrays had variances that were more nearly uniform, it is likely that the
approximation would be closer.
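
Formulas 10.2 and 10.3 are inverses of one another, which the following Python sketch illustrates. It assumes the proportionality stated above between predictable and unpredicted variance; the function names are illustrative, and the numbers are the predictor variances 2.1887 (restricted) and 4.0080 (unrestricted) with r = .55.

```python
import math


def unrestricted_from_restricted(r: float, v1: float, V1: float) -> float:
    """Formula 10.2: estimate R in the wider range from r in the narrower."""
    return math.sqrt(V1 * r ** 2 / (v1 - v1 * r ** 2 + V1 * r ** 2))


def restricted_from_unrestricted(R: float, v1: float, V1: float) -> float:
    """Formula 10.3: estimate r in the narrower range from R in the wider."""
    return math.sqrt(v1 * R ** 2 / (V1 - V1 * R ** 2 + v1 * R ** 2))


R01 = unrestricted_from_restricted(r=0.55, v1=2.1887, V1=4.0080)
r_back = restricted_from_unrestricted(R=R01, v1=2.1887, V1=4.0080)

print(round(R01, 2))     # estimate for the unrestricted range
print(round(r_back, 2))  # applying 10.3 recovers the restricted r
```

The round trip is a useful check when the formulas are keyed in by hand, since an error in either one shows up as a failure to recover the starting correlation.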

There is an important implication of the discussion of the effects of
changes in range. When psychological selection techniques are applied in
real situations, criterion scores will no longer be available for poor performers
eliminated because of low test scores, and the criterion in this
restricted range will automatically yield lower validities. Nevertheless the
devices may remain just as useful. Only if the decrease is greater than anticipated
should there be concern that the selection technique may have lost
some of its effectiveness.


PREDICTION WITH A TEAM OF VARIABLES 


MULTIPLE REGRESSION EQUATION 


basic phenomena remain identical, 


1 Mathematically, 
that of a plane. However.

The statistical method that is, explicitly or implicitly, behind most efforts
to forecast a criterion from a group of predictors is multiple correlation,
described in Chapter 7. There it was noted that multiple R is a product-moment
correlation between the criterion on the one hand and a weighted
sum of predictors on the other, each weighted so that (in the sample in
which R is found) the correlation is as high as possible. It is found by a forward
solution, which reduces step by step the variance of the criterion to a
partial variance from which all variation predictable from the "independent"
variables has been removed. From this partial variance, $V_{0.12 \ldots n}$, the
multiple, $R_{0(12 \ldots n)}$, is easily determined.

A back solution (which actually solves a set of n simultaneous linear
equations in n unknowns) yields the n betas. For each of the N cases, the
betas can be used as multipliers of the z scores on n predictors, yielding a
set of values of $\tilde{z}_0$, which have a correlation of $R_{0(12 \ldots n)}$ with the criterion
$z_0$. Thus the regression equation in z form is

\[ \tilde{z}_0 = \beta_{01.23 \ldots n} z_1 + \beta_{02.13 \ldots n} z_2 + \cdots + \beta_{0n.12 \ldots (n-1)} z_n \tag{7.1a} \]


This equation is seldom useful in practical prediction because it is applicable
only to predictors with equal standard deviations. However, to change
the standard deviation of any variable to 1.00, all that is needed is to divide
each value by the observed standard deviation. Thus the correlation between
$X'_0$, a weighted sum of raw scores, and the criterion is the multiple
if each raw score is divided by the standard deviation of the variable and
multiplied by the appropriate beta. This yields the following formula for
summing raw scores so as to produce a composite with maximum correlation
with a criterion:

\[ X'_0 = \beta_{01.23 \ldots n} \frac{X_1}{s_1} + \beta_{02.13 \ldots n} \frac{X_2}{s_2} + \cdots + \beta_{0n.12 \ldots (n-1)} \frac{X_n}{s_n} \tag{10.4} \]

The mean of $X'_0$ can be shown to be $\sum(\beta_i M_i / s_i)$ and the standard deviation
$R_{0(12 \ldots n)}$. With this information, $X'_0$ can be transformed into a series of
predictive scores in standard score form with assigned mean and assigned
standard deviation.


RAW-SCORE REGRESSION EQUATION

Slightly less convenient to use and yielding exactly the same correlation
with the criterion is the "raw-score form" of the regression equation. It can
be conveniently derived from the z score form merely by substituting for
each z its equivalent in raw-score terms, namely, $(X_i - M_i)/s_i$. Thus,
substituting in (Formula 7.1),

\[ \frac{\tilde{X}_0 - M_0}{s_0} = \beta_{01.23 \ldots n} \frac{X_1 - M_1}{s_1} + \beta_{02.13 \ldots n} \frac{X_2 - M_2}{s_2} + \cdots + \beta_{0n.12 \ldots (n-1)} \frac{X_n - M_n}{s_n} \tag{10.5} \]

We now multiply both sides by the standard deviation of the criterion
$s_0$. Each beta is thus multiplied by $s_0/s_i$, $s_i$ being the standard deviation of
the predictor. The resultant weights are called b weights and are written
with the same subscripts as the original betas.

At the same time we can:

1. Move $(-M_0)$ to the right-hand side of the equation with sign changed;
2. Collect from each term the constant portion, which is $(-\beta_i M_i s_0 / s_i)$;
and
3. Call the sum of all the constant terms K.

The equation then becomes

\[ \tilde{X}_0 = b_{01.23 \ldots n} X_1 + b_{02.13 \ldots n} X_2 + \cdots + b_{0n.12 \ldots (n-1)} X_n + K \tag{10.6} \]

In this equation, the value of K, which is $[M_0 - s_0 \sum(\beta_i M_i / s_i)]$, serves to
adjust the predicted scores so that their mean is $M_0$. The standard deviation
of $\tilde{X}_0$ is $s_0 R_{0(12 \ldots n)}$.
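
The conversion from betas to raw-score weights can be sketched in Python as follows. The betas, means, and standard deviations used here are hypothetical values chosen only to illustrate the steps (they are not taken from the text), and the function names are illustrative.

```python
from typing import List, Tuple


def raw_score_weights(
    betas: List[float],
    pred_means: List[float],
    pred_sds: List[float],
    crit_mean: float,
    crit_sd: float,
) -> Tuple[List[float], float]:
    """Return (b weights, constant K) for the raw-score regression equation.

    b_i = beta_i * s0 / s_i
    K   = M0 - sum(b_i * M_i)
    """
    b = [beta * crit_sd / sd for beta, sd in zip(betas, pred_sds)]
    k = crit_mean - sum(bi * m for bi, m in zip(b, pred_means))
    return b, k


def predict(b: List[float], k: float, scores: List[float]) -> float:
    """Predicted raw criterion score for one case."""
    return sum(bi * x for bi, x in zip(b, scores)) + k


# Hypothetical two-predictor example.
b, K = raw_score_weights(
    betas=[0.5, 0.3],
    pred_means=[50.0, 20.0],
    pred_sds=[10.0, 5.0],
    crit_mean=100.0,
    crit_sd=15.0,
)

# A case at the mean of every predictor is predicted at the criterion mean.
at_means = predict(b, K, [50.0, 20.0])
print(b, K, at_means)
```

The final line illustrates the role of K described above: it shifts the predictions so that their mean equals the criterion mean.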

The raw-score regression equation yields predicted scores that are as
close as possible to the original criterion scores. It is therefore of interest
when the scale used for the criterion needs to be reflected in the predicted
scores.


MODIFICATIONS OF THE MULTIPLE CORRELATION TECHNIQUE 


For any sample there is a unique set of betas for a group of predictors, 
yielding a maximum value of the multiple R. However, there are various 
modifications of the regression weights that often yield a correlation be- 
tween a weighted score and a criterion that is almost as high as R. Regres- 
sion weights can often be modified considerably without lowering the 
correlation appreciably. 

One convenient modification is to round the weights so that they become 
single-digit integers approximately proportional to the regression coef- 
ficients. If there is no need to equate the mean of the predictive scores to 
the criterion mean, the simplest procedure is to divide each beta by the 
standard deviation of the predictor to which the beta is to be applied, and 
then choose a set of integers that has a high relationship with these ratios. 


Since each beta has been divided by the standard deviation of the predictor,
the new weights can be applied to raw scores.

Sometimes it is appropriate to eliminate from the predictive equation
the variables with negative betas. In theory, a "suppressor" variable with
a negative regression weight (already described in Chapter 7) can add considerably
to the multiple. When a "suppressor" can logically be anticipated,
there may be good reason for using it. However, the slight negative
betas that occasionally appear in regression equations without logical
explanation probably represent variance pretty well duplicated by other
predictors, and these variables might well be eliminated. Also, there is
probably more justification for using negative weights with biographical
and personality variables than with aptitude tests on which all candidates
are expected to perform to capacity.

Generally speaking, as discussed in Chapter 7, the larger the number of
variables and the smaller the number of cases, the more "shrinkage" is to
be expected. In establishing a regression equation on a particular sample,
all errors implicit in the measurement are used in the calculation of the line
of best fit. Consequently, the regression line fits better in the sample than
would be expected in the population. With two or three predictors, all of
which are highly reliable, the shrinkage may not be serious. The shrinkage
becomes large when relatively unreliable data, such as item information,
are used. In this case the error in the various intercorrelations is compounded.


IMPROVEMENT OF PREDICTION FROM A MULTIPLE REGRESSION EQUATION

In addition to developing predictors that measure more of the criterion
variance, there are three ways to improve the validity of regression equations.
The first is to improve the reliability of the predictor variables. As
the reliability of these variables is increased, their intercorrelations tend to
become stable, and stable intercorrelations improve the stability of the
regression equation. The second principle is to use large samples in establishing
the regression equation. Here again the direct effect is on the stability
of zero-order r's. Reliability of constituent variables remaining the same,
the reliability of a composite is increased as the number of observations
grows larger. The third principle involves the number of predictor variables.
As their number increases, there is more chance for the capitalization on
chance error. Hence it is a good general plan to use as few variables as
possible, consistent with reaching a multiple R somewhat close to the
maximum validity. In the typical situation in applied psychology, little is
generally gained from using more than five or six predictors. These may be
selected by the modification of the Wherry-Doolittle technique described
in Chapter 7.

While the amount of "shrinkage" in a multiple can be anticipated, personnel
psychologists have generally found that there is no substitute for
"cross-validation," that is, trying out in a completely new sample any
multiple regression technique developed in an observed sample. Only when
a technique involving maximization holds up repeatedly on new samples
does it become worthy of confidence. With multiple R it is all too easy to
maximize a relationship within a sample without being aware of the degree
to which there has been capitalization of error.


MULTIPLE CUT-OFF 


In prediction with a number of variables, the use of multiple cut-offs instead
of multiple correlation is occasionally advocated. In practical situations,
multiple cut-offs are often used, as when candidates for a police force
are acceptable only when they fall within specified limits of age, height, and
weight, and meet specified requirements as to physical condition and residence.
A conviction for a felony or even a misdemeanor may be disqualifying
for police work. Cut-offs such as these are generally established administratively
rather than by validation. The theory of multiple cut-off is that
deficiency in one characteristic cannot be compensated for by excellence in
another.

Multiple correlation and a series of multiple cut-offs used to select the
same number of individuals from the same parent sample would ordinarily
select somewhat differently. However, if both procedures were based on
empirical validities, the overlap would be great. In practice, a series of
multiple cut-offs has seldom, if ever, been demonstrated to predict better
than multiple R. The multiple cut-off procedure would probably work better
than multiple R with a number of predictors related to the criterion in
curvilinear fashion.


SUMMARY 
Psychological prediction involves two steps:

1. Determination of a relationship within an observed sample; and
2. Application of knowledge of this relationship to new cases.

Obviously, unless the sample is representative of a wider population and
unless the relationship is stable from sample to sample, prediction is
ineffective.

A predictive system can be expressed as a graph ("expectancy chart"),
as a scatter diagram, or as a mathematical function. In psychological
statistics the function is most often a straight line, as in a regression
equation.

Predictions of mean scores ("group predictions") are numerically more
accurate than predictions of scores for individuals. The main difference,
however, is in the way results are summarized.

When an investigator attempts to predict a continuous criterion exactly,
psychological methods appear to involve a large proportion of error. In
real-life situations, prediction of a criterion in broad categories is ordinarily
sufficient. Here, even low correlations are useful.

The utility of a relationship is to be judged not only by the numerical
size of the correlation, but also by the way the criterion is categorized and
by the critical point used in selection.

The numerical size of the correlation coefficient changes greatly with
variation in the "range of talent" without affecting the efficiency of the
observed relationship in making predictions.


EXERCISES 


1. Using the data of Table 6.4, develop an expectancy chart showing the pre- 
diction of reading rate from reading comprehension. One way of doing this 
would be to find the percentage of cases with reading rate scores of, say, 
56 or more, for each of the following groups on the reading comprehension 
test: 70-99, 100-129, 130-159, 160-189, and 190-219. 


2. Again using the data of Table 6.4, try out the usefulness of Formula 10.2
for correction for restriction of range. Using only cases with X scores of
140 or more, find $r_{xy}$, the correlation in the range restricted on the predictor
variable. Using this correlation and the variance of X in both restricted and
unrestricted range, compare $R_{xy}$ as estimated with the computed $r_{xy}$ of .63.


3. A dichotomous criterion is met by 40 percent of a group of 100 cases. The φ
coefficient between a dichotomous predictor, with 60 percent in the upper
category, and the criterion is .50.

(a) Construct the diagram showing this relationship.
(b) For another group of 100 cases with similar divisions into upper and
lower categories, construct the diagram for φ = .00.


4. Prior to the introduction of a systematic selection program in a certain
company, the proportion of salesmen considered successful was .45. A test
battery with a validity of .25 was introduced, using a selection ratio of .35.
From the Taylor-Russell tables (3) or from Table 10.7, estimate the proportion
of selected individuals who would be successful if the criterion remains
unchanged.

5. A criterion variable has a standard deviation of 12. The correlation between
the predictor and the criterion is .60. What is the expected standard deviation
of the differences between predicted scores and actual criterion values?


6. Given the following regression equation:

\[ \tilde{z}_0 = .20z_1 + .39z_2 + .19z_3 + .04z_4 \]

and means and standard deviations as follows:

VARIABLE      M    S.D.
X0            3      1
X1            8      2
X2           50      8
X3          550    120
X4          500    100

develop the regression equation in raw-score form.


7. Prepare a diagram to illustrate the difference between selection by a single, 
weighted combination of two predictors (as in multiple correlation) and the 
use of two separate cutting scores for the same two predictors. On a hypo- 
thetical bivariate distribution indicate which cases selected by one technique 
would be rejected by the other, and vice versa, overall selection rate being 
the same for the two methods. 


8. When the intercorrelations of three variables in a restricted range are known
together with the variances of variable 2 in the restricted range ($v_2$) and the
unrestricted range ($V_2$), a Pearson formula (2) for the correlation between
variables 0 and 1 in the unrestricted range may be written as

\[ R_{01} = \frac{r_{01} v_2 + r_{02} r_{12} (V_2 - v_2)}{\sqrt{\left[v_2 + r_{02}^2 (V_2 - v_2)\right]\left[v_2 + r_{12}^2 (V_2 - v_2)\right]}} \]

If the intercorrelations of the three variables in a restricted range are all .40
and if $v_2 = 100$ and $V_2 = 140$, what is the estimate of $R_{01}$?


PROBABILITY AND THE NORMAL CURVE


11 


THE NATURE OF THEORETICAL DISTRIBUTIONS 


SIMPLE PROBABILITY 
One chief function of a theoretical distribution is to evaluate an event
(that is to say, a finding or an observation) in terms of its probable rarity,
determined on the basis of a set of stated conditions.

Under a group of hypotheses or conditions, one may deduce a list of all
possible events in a given domain, together with their relative frequency,
on the basis of chance alone. It then becomes possible, for each actual
event in the domain, to find its probable frequency (within a more or less
[...]) stated as a proportion of 1.000.

Consider the throw of a die. A die has six faces, with values from
1 to 6. A reasonable condition would be that each face is equally likely
to fall uppermost. For an event such as a 5 falling uppermost,
out of a large number of throws, one-sixth of them can be expected
to be 5. The probability of throwing a 5 on any given cast can be stated
as .1667.

Usually the probability of a single event in a series of discrete events is
of little interest. In the case of the throw of a die, for example, we are more
likely to be interested in the probability of throwing a 5 or higher than in
the probability of throwing a 5. Accordingly, we should add to the
probability of throwing a 5 (namely, .1667) the probability of throwing a 6,
since 6 is the only higher value. The probability of throwing either a 5 or
a 6 is 2/6 of all the possibilities, each of which is taken as equally likely to
occur. Accordingly, the probability of throwing a 5 or higher, with one
die and with a single throw, is .3333.

For a die with a different number on each face, the probability for each
of the six faces is exactly the same. Consider a somewhat different situation:
a die with one face with a value of 1; two faces, each with a value of
2; two faces, each with a value of 3; and one face with a value of 4. The
probabilities on any single throw are now:


DIE FACE    PROBABILITY
   1           .1667
   2           .3333
   3           .3333
   4           .1667


Because of greater numbers of 2's and 3's than 1's or 4's, the distribution
is no longer composed of equal theoretical frequencies, but is humped
in the middle. With theoretical N's of 100 and 300, the following distributions
are expected:

THEORETICAL THEORETICAL 
FREQUENCIES FREQUENCIES 


VALUE (N — 100) (N — 300) 
4 17 50 
3 33 100 
2 33 100 
1 17 50 


The frequencies for N = 100 have been rounded to the nearest whole 
number but, except for the decimal point, these frequencies correspond to 
probabilities that add up to 1.0000. For convenience, theoretical fre- 
quencies usually add to 100, or to 1000, or to 10,000, while probabilities 
add to unity. It is readily seen that the basic principle is the same, even in 
the case of the column in which the theoretical frequencies sum to 300. 
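
The probabilities and theoretical frequencies for the modified die can be reproduced with a short Python sketch. Exact fractions are used so that the probabilities sum to exactly 1; the names `faces` and `theoretical_frequencies` are illustrative.

```python
from collections import Counter
from fractions import Fraction

# The six faces of the modified die described above.
faces = [1, 2, 2, 3, 3, 4]

# Probability of each value = (number of faces showing it) / (number of faces).
counts = Counter(faces)
probabilities = {
    value: Fraction(count, len(faces)) for value, count in sorted(counts.items())
}


def theoretical_frequencies(probs, n):
    """Expected frequency of each value in n throws, rounded to whole numbers."""
    return {value: round(p * n) for value, p in probs.items()}


freq_100 = theoretical_frequencies(probabilities, 100)
freq_300 = theoretical_frequencies(probabilities, 300)

print({v: round(float(p), 4) for v, p in probabilities.items()})
print(freq_100)
print(freq_300)
```

With N = 300 no rounding is needed, because 300 is a multiple of 6; with N = 100 each frequency is rounded, yet the total still comes to 100 in this case.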

Probabilities provide numerical values that reflect the degree of truth of
certain statements. When the probability of an event is .00, it is certainly
untrue that the event will occur. When the probability of an event is 1.00,
it is certainly true that the event will occur. Thus, with a single toss of a
die with faces valued from 1 to 6, inclusive, the probability of throwing a
7 or greater is .00, and the probability of throwing some number greater
than 0 and less than 7 is 1.00.


COMPOUND PROBABILITY

Consider two independent events such that the occurrence of one has no
effect on the occurrence of the other. Let each event be such that either it
occurs or does not occur. Let the probability of one event occurring be $p_1$
and the probability of its nonoccurrence be $q_1$. Similarly, with the second
event, let the probability of its occurrence be $p_2$ and the probability of its
nonoccurrence be $q_2$. It is understood that in both cases $(p + q) = 1.00$,
since each event is certain either to occur or not to occur. The probabilities
of the first event occurring or not occurring can be represented as

\[ p_1 + q_1 = 1.00 \tag{11.1} \]

and of the second event

\[ p_2 + q_2 = 1.00 \tag{11.2} \]

The joint probabilities, as will shortly appear appropriate, may be
found by multiplying Eq. 11.1 by Eq. 11.2. Thus,

\[ (p_1 + q_1)(p_2 + q_2) = p_1 p_2 + p_1 q_2 + p_2 q_1 + q_1 q_2 \tag{11.3} \]

It is now seen that, with two independent events, the probability of both
occurring is the product of their separate probabilities, $p_1 p_2$. The probability
of the first event occurring without the second is $p_1 q_2$ and of the
second without the first is $p_2 q_1$. Finally, the simultaneous probability of
neither occurring is $q_1 q_2$.

Suppose $p_1$, the probability of the first event, is .4, and $p_2$, the probability
of the second event, is .7. Accordingly, $q_1 = .6$ and $q_2 = .3$. Substitution
of these values in Formula 11.3 yields a joint distribution, based on
compound probability.

p1p2 = .4 × .7 =  .28
p1q2 = .4 × .3 =  .12
p2q1 = .7 × .6 =  .42
q1q2 = .6 × .3 =  .18

TOTAL            1.00

All probabilities of occurrence and nonoccurrence add to 1.00, and
the fractions of 1.00 are allotted to the four different possibilities.

The same procedure can be applied to developing the theoretical distribution
for a true-false test of two items. Let us state, as a basic condition
under which the distribution is developed, that success on each item comes
entirely by chance, and that for item 1, $p_1 = .5$ and $q_1 = .5$; and also for
item 2, $p_2 = .5$ and $q_2 = .5$. Applying Formula 11.3 again, we have

p1p2 = .5 × .5 = .25
p1q2 = .5 × .5 = .25
p2q1 = .5 × .5 = .25
q1q2 = .5 × .5 = .25

It now appears that $p_1 p_2$ represents the probability of attaining a score
of 2, entirely by chance. Similarly, $p_1 q_2$ and $p_2 q_1$ represent the probabilities
of attaining a score of 1. Since, in scoring psychological tests, we generally
do not preserve the information as to the specific items answered correctly,
these two categories can be consolidated. The final term, $q_1 q_2$, is the
probability of failure on both items. Accordingly, the probabilities are


SCORE    PROBABILITY
  2          .25
  1          .50
  0          .25

This is a theoretical distribution. To be sure, it is of limited value for 
testing hypotheses, but an important principle has been demonstrated; 
namely, that by the principles of mathematical probability we can find a 
distribution expected when, within a certain framework, only chance is 
operating. 
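
The expansion in Formula 11.3 and the consolidation of its terms into a score distribution can be sketched in Python. The sketch assumes two independent events, as in the text; the function names and dictionary labels are illustrative.

```python
def joint_distribution(p1: float, p2: float) -> dict:
    """Four terms of (p1 + q1)(p2 + q2) for two independent events."""
    q1, q2 = 1 - p1, 1 - p2
    return {
        "both": p1 * p2,
        "first only": p1 * q2,
        "second only": p2 * q1,
        "neither": q1 * q2,
    }


def score_distribution(p1: float, p2: float) -> dict:
    """Chance distribution of scores (number of successes) on two items."""
    joint = joint_distribution(p1, p2)
    return {
        2: joint["both"],
        1: joint["first only"] + joint["second only"],  # consolidated
        0: joint["neither"],
    }


# The numerical illustration with p1 = .4 and p2 = .7.
events = joint_distribution(0.4, 0.7)
print({name: round(prob, 2) for name, prob in events.items()})

# The two-item true-false test answered entirely by chance.
scores = score_distribution(0.5, 0.5)
print(scores)
```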


TESTING A HYPOTHESIS WITH A CHANCE DISTRIBUTION 


We shall now develop a chance distribution and then use it to test a hypothesis.

Consider a psychological test of six items, each with five choices. If a
large group of subjects answers the six items entirely by chance, with no
knowledge whatever of the subject matter, with each choice of each item
equally likely to be chosen but with only one answer recorded for each
item, what would be the distribution of the scores?

This question can be answered by the use of $(p + q)$ raised to the sixth
power, since there are six independent events: the six items. This is in
contrast with the case just described with two independent events. Since
there are five choices here, only one of which is correct, the probability of
success on any one item is 1/5, or .20. Accordingly, on any item, the
probability of failure (denoted as q) is .80. We could, if we wished, keep
the p with their subscripts (such as $p_1$ and $p_2$) distinct as we multiplied
repeatedly by $(p_i + q_i)$. But it is simpler to raise $(p + q)$ to the sixth power,
using the exponent of p, which in successive terms is 6, 5, and so on down to
zero, to indicate the number of coinciding successes. From each term the
probability of the score corresponding to the exponent can then be
calculated.

We can expand $(p + q)^6$ by the conventional binomial expansion. When
$(p + q)$ is expanded to the nth power, there are $(n + 1)$ terms. In successive
terms, exponents of p decrease from n to 0; and the exponents of q increase
from 0 to n. (In elementary algebra it is noted that any quantity except 0
raised to the zero power is 1. When terms include $p^0$ or $q^0$, there is no need
to write $p^0$ or $q^0$ explicitly. Accordingly, the first term is written as $p^n$ and
the last term as $q^n$.)

The $(n + 1)$ coefficients are the combinations of n things, 0, 1 ... n at a
time, a series that is identical and symmetrical with the combinations of n
things n, (n - 1) ... 0 at a time. The combination of n things r at a time,
which can be written $\binom{n}{r}$, is always $n(n - 1)(n - 2) \cdots$ to r terms, divided by
$1 \cdot 2 \cdot 3 \cdots$ to r terms. This product of r successive integers beginning with 1 is
factorial r (written as r!), while $n(n - 1)(n - 2) \cdots$ to r terms is equal to
$n!/(n - r)!$. Hence $\binom{n}{r}$ is $n!/[(n - r)!\, r!]$. Factorial 0 (or 0!) is 1 (as is factorial
1). Accordingly, $\binom{n}{0}$ is $n!/n!$, or 1, as is $\binom{n}{n}$, so that the coefficient of the first
term and the coefficient of the last term are both 1 (and need not be written
explicitly).

By the above principles, 

(p+ 4)% = р + 6p?q + 15р“? 4 20р?д? $ 15p?q* + 6pq? 44% 
Formula 11.4 yields the probabilities for six independent conditions. 
tical distribution 


Substitution of .20 for p and .80 for q results in the theoretic 
Of scores when there are six independent items with a likelihood of .20 of 


8 Е чае х 
Uccess on each item. The distribution is as follows: 


(11.4) 


SCORE PROBABILITY 

6 (20) = .000064 
5 6(.20)5(.80) = .001536 
4 15(.20)4(.80)? = .015360 
3 20(,20)%(.80)8 = .081920 
2 15(20)%(80) = .245760 
1 6(20)(80)5 = .393216 
0 ((80)8 = .262144 


TOTAL 1.000000 

Again the probabilities sum to unity. The most common chance scores
are 0, 1, and 2, with 1 having the greatest frequency. Chance scores of 3
occur about 8 percent of the time, while chance scores of 4, 5, and 6 are
relatively rare.

Suppose a classroom test consists of six items, all of the five-choice type.
Suppose also that a member of the class has a score of 5. We assume that
the test has been so well constructed that anyone who knew nothing of the
subject matter would obtain a chance score. From the table it can be noted
that the probability of attaining a score of 5 is .001536 and of attaining a
score of 6 is .000064.

The hypothesis set up is, “The score of this student is the result of
chance, not knowledge.” In testing the hypothesis, it seems appropriate
to lump together the probabilities of the attained score and any higher
score or scores (.001536 + .000064 = .0016). It thus appears that there are
only 16 chances in 10,000 of a score of 5 or 6 occurring purely by chance
(p = .0016). Accordingly, it is reasonable to conclude that the student's
score of 5 is not a chance happening.
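
As a check on the arithmetic, the distribution of Formula 11.4 and the lumped
probability of a score of 5 or 6 can be computed directly. The following is a
minimal sketch in Python; the function name `binomial_distribution` is an
illustrative choice, not something defined in this text.

```python
from math import comb

def binomial_distribution(n, p):
    """Return {score: probability} for n independent items,
    each with probability p of success (terms of (p + q)^n)."""
    q = 1.0 - p
    # Scores run from n down to 0, as in the table for Formula 11.4.
    return {r: comb(n, r) * p**r * q**(n - r) for r in range(n, -1, -1)}

dist = binomial_distribution(6, 0.20)
for score, prob in dist.items():
    print(f"{score}  {prob:.6f}")

total = sum(dist.values())            # should be unity
tail_5_or_more = dist[5] + dist[6]    # attained score plus any higher score
print(f"total = {total:.6f}")
print(f"P(score of 5 or 6) = {tail_5_or_more:.4f}")
```

The printed probabilities should agree with the table, and the last line with
the value of .0016 used in testing the hypothesis.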

By multiplying by N, probabilities can be translated into expected
frequencies. If the probabilities of the preceding distribution are multiplied
by 1000 and rounded to the nearest whole number, the following theoretical
frequencies ($f_e$) are obtained:

SCORE     f_e

  6         0
  5         2
  4        15
  3        82
  2       246
  1       393
  0       262

TOTAL    1000 = N

The sum of the expected frequencies is supposed to be N. However,
because of rounding error, integral expected frequencies may not sum
precisely to N. For example, if N in the preceding example were 500
instead of 1000, $\sum f_e$ would be 501. Also, the expected frequency of 0 for a
score of 6 is not precisely 0, but rather a very small number.
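
The effect of rounding on the sum of the expected frequencies can be verified
with a short computation. This is a sketch only; `expected_frequencies` is a
hypothetical helper name introduced for illustration.

```python
from math import comb

def expected_frequencies(n_items, p, n_cases):
    """Expected frequency of each score (n_items down to 0), rounded to integers."""
    probs = [comb(n_items, r) * p**r * (1 - p)**(n_items - r)
             for r in range(n_items, -1, -1)]
    return [round(n_cases * pr) for pr in probs]

f_1000 = expected_frequencies(6, 0.20, 1000)
f_500 = expected_frequencies(6, 0.20, 500)
print(f_1000, sum(f_1000))   # frequencies for N = 1000 and their sum
print(f_500, sum(f_500))     # frequencies for N = 500 and their sum
```

With N = 1000 the rounded frequencies sum to exactly 1000; with N = 500 they
sum to 501, as stated above.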


THE NORMAL PROBABILITY CURVE 


The binomial expansion, as exemplified in raising (p + q) to the nth power,
is related to the normal probability curve, which has widespread
applications in statistics and which is useful in that in many (but not all)
types of psychological measurement the following characteristics are observed:

1. Scores or measurements tend to cluster close to the mean, with small
deviations more numerous than large deviations.

2. The distribution of measurements tends to be symmetrical.

3. The greater the distance from the mean, the rarer the score. Beyond
certain limits above and below the mean, scores almost never occur.


To facilitate deductions from a collection of observed data, a mathe- 
matical model of such a distribution is needed. When the observations are 
such that the mathematical model is applicable, then properties of the 
mathematical curve can be taken as properties of the parent population 
represented by the observed sample. 

If a phenomenon is caused by a large number of factors, each of them
equally influential and each equally likely to be present or absent, the
distribution of the resultant variable can be represented by the binomial
expansion $(p + q)^n$ when p = q. However, if n is large, the successive terms
of $[n!/((n - r)!\,r!)]\,p^{n-r}q^r$, representing frequencies of successive values,
become burdensome to evaluate. Tables for this “point binomial”
distribution curve would also be too extensive to publish.

The normal curve is a close approximation to the binomial. In effect, it is
generated by taking p equal to q, expanding without limit, and evaluating
the formula for any term through the use of an approximation for factorials
developed by Stirling in the eighteenth century. In this expansion of
$(p + q)^n$, both p and q are .5 and n approaches infinity. The curve was
originally discovered by DeMoivre, rediscovered by Gauss and by
Laplace, and applied to anthropometric variables by Quetelet and Galton.
The formula for the curve gives its height or ordinate at any point on the
base line, or abscissa, representing the values of the variable.
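
The closeness of the approximation can be illustrated numerically. The sketch
below compares terms of the point binomial for n = 100 and p = q = .5 with the
height of a normal curve having the same mean (np) and standard deviation
(the square root of npq). The choice of n = 100 and the function names are
assumptions made for illustration only.

```python
from math import comb, exp, pi, sqrt

def binomial_term(n, r, p=0.5):
    """Probability of exactly r successes in n trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def normal_height(x, mean, sd):
    """Ordinate of a normal curve (area 1) with the given mean and sd."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

n = 100
mean = n * 0.5                  # np
sd = sqrt(n * 0.5 * 0.5)        # sqrt(npq)
for r in (50, 55, 60):
    print(r, round(binomial_term(n, r), 5), round(normal_height(r, mean, sd), 5))
```

Each pair of printed values should differ only in the fourth decimal place,
which suggests why the normal curve can stand in for the binomial when n is
large.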


FORMULAS FOR THE NORMAL PROBABILITY FUNCTION 


The formula, the development of which is beyond the scope of this text,
can be written¹ as

$$y = \frac{N}{\sigma_x \sqrt{2\pi}} \, e^{-(X - \mu_x)^2 / 2\sigma_x^2} \qquad (11.5)$$

The variable is denoted as X, with mean represented by μ_x and standard
deviation by σ_x. The total number of cases is N. Two mathematical
constants appear in the formula, π (the ratio of the circumference of a circle
to its diameter, approximately 3.14159) and e, important in calculus and
the base of the natural system of logarithms, approximately 2.71828.
The formula can be simplified by the following steps:

1. N can be taken as 1. Thus, areas under the curve will be in proportions
of N and will be more convenient with which to work.

2. The variable can be measured in z scores with mean of zero and
standard deviation of unity. Thus σ_x will drop out in front of the radical
and the exponent of e will become $-z^2/2$.

3. $\sqrt{2\pi}$ can be evaluated numerically as approximately 2.50662. When
divided into unity, this yields .39894 as the numerator of the formula.

4. The negative exponent of e can be eliminated by moving the expression
involving e to the denominator.

The formula now becomes

$$y = \frac{.39894}{e^{z^2/2}} \qquad (11.5a)$$

in which y is the height of the normal curve when the frequencies are
expressed in proportions of N and when values of the variable are in
z scores.

¹ The standard deviation in Formula 11.5 is a parameter and is written as σ_x
rather than s_x. Similarly, the mean is indicated by μ rather than by M.
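
Formula 11.5a is simple enough to evaluate directly. The following sketch
(the function name `normal_ordinate` is an illustrative assumption) computes
the numerator and the height of the curve at a given z.

```python
from math import exp, pi, sqrt

def normal_ordinate(z):
    """Height of the normal curve at z, with N = 1 and z in
    standard deviation units (Formula 11.5a)."""
    numerator = 1 / sqrt(2 * pi)      # approximately .39894
    return numerator / exp(z**2 / 2)  # e moved to the denominator

print(round(1 / sqrt(2 * pi), 5))     # → 0.39894
print(round(normal_ordinate(1.0), 5)) # → 0.24197
```

Because z enters only as a square, `normal_ordinate(-1.0)` gives the same
height as `normal_ordinate(1.0)`.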


PROPERTIES OF THE NORMAL CURVE 


By examining Formula 11.5a, certain properties of the normal curve can 
be deduced: 


1. The ordinate of the curve at the mean of the distribution is .39894.
This follows from the fact that any number (other than 0) raised to
the zero power is 1. At the mean, z is 0, and the denominator is $e^0$ or
$(2.71828)^0$, which is 1. Accordingly, y, the height of the curve, is
.39894.

2. The highest part of the curve is at the mean, where z equals zero.
As $z^2$ increases from its minimum, $e^{z^2/2}$ also increases and y decreases.

3. The curve is symmetrical. Since z is squared, the ordinate will be
exactly the same for positive and negative values of z.

4. Since the highest part of the curve is at the mean, the mode and the
mean coincide. Since the curve is symmetrical, the median and the
mean coincide. Hence, three measures of central tendency are identical.

5. With large values of z, y approaches zero. Mathematically, y never
reaches zero; however, y becomes so small that, for practical purposes,
it is negligible. As will be shown later, only a small proportion of the
cases fall outside the limits of ±3.00 standard deviations; that is,
below a z of -3.00 and above a z of +3.00.


By the methods of the calculus (equating the second derivative of the 
formula to zero), it can be ascertained that the inflection of the curve 
changes at points exactly one standard deviation above the mean and 
exactly one standard deviation below it. 
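
For readers who wish to follow the calculus, the step can be sketched as
follows, starting from Formula 11.5a written with a negative exponent. This is
an outline of the standard derivation, not a reproduction of a derivation
given elsewhere in this text.

```latex
\[
y = \frac{1}{\sqrt{2\pi}}\, e^{-z^{2}/2},
\qquad
\frac{dy}{dz} = -z\,y,
\qquad
\frac{d^{2}y}{dz^{2}} = -y - z\,\frac{dy}{dz} = (z^{2} - 1)\,y .
\]
\[
\text{Since } y > 0 \text{ for every } z,\quad
\frac{d^{2}y}{dz^{2}} = 0
\iff z^{2} = 1
\iff z = \pm 1 .
\]
```

Between z = -1 and z = +1 the second derivative is negative and the curve is
concave downward; beyond those points it is positive and the curve is concave
upward.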

As indicated by this statement and as shown explicitly in Formula 11.5
(and implicitly in Formula 11.5a), the standard deviation helps to “define
the normal curve.” The converse is not true. The normal curve is not
involved in the definition of the standard deviation. Irrespective of the shape
of a distribution, the standard deviation is a meaningful concept.
However, when one knows that a distribution is normal, or approximately
so, knowledge of its mean and standard deviation permits numerous
deductions about specific parts of the distribution. The closer an obtained
distribution approximates the normal distribution, the more valid the
deductions.


DEVELOPMENT OF A TABLE OF THE NORMAL CURVE


The probability functions of greatest interest in statistics are awkward;
they are most conveniently used in the form of tables. Example 11.1
illustrates the development of tables of the normal curve by reasonably
effective (but not necessarily the most accurate nor the most elegant)
methods.

The first part of the example shows how values of the ordinate are
found from Formula 11.5a and a table of natural logarithms, similar to
Table E (Appendix) but more detailed.


EXAMPLE 11.1

Development of Tables of the Normal Curve. In this example, the development
of tables of the normal curve is shown in four steps:

1. The computation of values of the ordinate from z = .00 to z = 3.00 by the
use of Formula 11.5a. The work is shown in Table 11.1, and the values are those
of Table O in the Appendix.

2. Plotting these values as Figure 11.1, which shows one-half of a normal curve.

3. Counting the units between successive pairs of z values, and cumulating these
counts.

4. Determining the proportions of the total area of the curve which lie between
the mean and the particular z value. These proportions, shown in the final
column of Table 11.2, are, in effect, the values of Table A in the Appendix.


TABLE 11.1. COMPUTATION OF VALUES OF y FROM z = .00 TO z = 3.00.

                    ANTILOG_e
   z      z²/2      OF z²/2     ORDINATE (y)
  .00      .00      1.00000       .39894
  .20      .02      1.02020       .39104
  .40      .08      1.08329       .36827
  .60      .18      1.19722       .33322
  .80      .32      1.37713       .28969
 1.00      .50      1.64872       .24197
 1.20      .72      2.05443       .19419
 1.40      .98      2.66446       .14973
 1.60     1.28      3.59664       .11092
 1.80     1.62      5.05309       .07895
 2.00     2.00      7.38906       .05399
 2.20     2.42     11.24586       .03547
 2.40     2.88     17.81427       .02239
 2.60     3.38     29.37077       .01358
 2.80     3.92     50.40044       .00792
 3.00     4.50     90.01713       .00443

In Table 11.1 the first column gives z, the deviation on either side of the mean
in standard deviation units. Ordinates are computed at intervals of a fifth of a
standard deviation (whereas in Table O, in the Appendix, ordinates are given
at intervals of .01σ). The second column gives values of the exponent of e in
Formula 11.5a, corresponding to the z values. For example, at 1.00 standard
deviation from the mean (z = 1.00), z²/2 is .50. The numerical value of the
denominator is found by using a table of natural logarithms, similar to, but
more extensive than, Table E in the Appendix. These logarithms have a base of
e, or 2.71828, instead of common logarithms with a base of 10. Since any exponent
of e is the natural logarithm of some number, $e^{z^2/2}$ is evaluated by finding the
antilogarithm corresponding to z²/2. These numbers are displayed in the third
column. It has already been noted how .39894 can be found as the ordinate at
the mean. To find other ordinates, .39894 is divided by the appropriate
antilogarithm, yielding the ordinates as shown in the fourth column.
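
The four columns of Table 11.1 can be regenerated without a table of natural
logarithms. The sketch below uses Python's `math.exp` in place of the
antilogarithm lookup; that substitution, and the variable names, are
assumptions made for illustration.

```python
from math import exp, pi, sqrt

peak = 1 / sqrt(2 * pi)          # ordinate at the mean, about .39894
rows = []
for i in range(16):              # z = .00, .20, ..., 3.00
    z = i / 5
    exponent = z**2 / 2          # second column of Table 11.1
    antilog = exp(exponent)      # third column: e raised to z²/2
    rows.append((z, exponent, antilog, peak / antilog))

for z, exponent, antilog, y in rows:
    print(f"{z:4.2f}  {exponent:4.2f}  {antilog:9.5f}  {y:.5f}")
```

Each printed line should match the corresponding row of Table 11.1 to five
decimal places.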


TABLE 11.2. COMPUTATION OF PROPORTIONS OF NORMAL CURVE BETWEEN μ_x
AND z (THAT IS, x/σ_x).

                                  CUMULATIVE                     PROPORTION OF
LIMITS OF     UNITS (SQUARES)     NUMBER OF       PARTITIONING   TOTAL AREA OF
INTERVAL      IN INTERVAL         UNITS BETWEEN   z VALUE        CURVE BETWEEN
(z VALUES)    (FIG. 11.1)         MEAN AND z      (x/σ_x)        MEAN AND z
0.00-0.20        159.0              159.0           .20            .080
0.20-0.40        151.5              310.5           .40            .156
0.40-0.60        140.5              451.0           .60            .226
0.60-0.80        124.5              575.5           .80            .289
0.80-1.00        105.5              681.0          1.00            .342
1.00-1.20         86.5              767.5          1.20            .385
1.20-1.40         68.5              836.0          1.40            .420
1.40-1.60         50.5              886.5          1.60            .445
1.60-1.80         38.5              925.0          1.80            .464
1.80-2.00         27.0              952.0          2.00            .478
2.00-2.20         17.5              969.5          2.20            .487
2.20-2.40         11.5              981.0          2.40            .492
2.40-2.60          6.5              987.5          2.60            .496
2.60-2.80          3.5              991.0          2.80            .497
2.80-3.00          2.0              993.0          3.00            .498
Beyond 3.00        3.0              996.0


In Fig. 11.1 these ordinates are plotted on graph paper. Computations to find
the proportions are shown in Table 11.2. First of all on the graph, between
pairs of z values, the number of squares is counted or, rather, estimated. Squares
that are not intersected by the curve are easy enough to count. In each section,
however, there are also several partial squares, which can be summed only
approximately. In Table 11.2 the limits of each interval are given in the first
column. The approximate number of units (squares) within that interval are
shown in the second column. The units are, of course, completely arbitrary, since
their number would vary with the type of graph paper and with how the curves
were laid out on that particular type of paper. In the third column the units have
been cumulated away from the mean. The total number of units, including an
estimated three beyond 3.00σ, is 996. Since only half of the curve is included,
these 996 arbitrary units correspond to one-half of the total N. In order to work
with proportions, N is taken as 1; hence, the proportion of the curve above the
mean is .500.

By dividing each cumulative number of units by 1992 (twice the number of
arbitrary units), the entries for the last column are obtained. These indicate the
proportion of total area of the curve between the mean and the partitioning
z value shown in the preceding column, and correspond to the entries in Table A
in the Appendix.
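
Counting squares on graph paper is a manual form of numerical integration. As
an illustration only, the same areas can be approximated by the trapezoidal
rule, a different technique from the square counting described above; the
function names and the step size are assumptions.

```python
from math import exp, pi, sqrt

def ordinate(z):
    """Height of the normal curve at z (N = 1, z in standard deviation units)."""
    return exp(-z**2 / 2) / sqrt(2 * pi)

def area_from_mean(z_limit, steps_per_unit=1000):
    """Trapezoidal estimate of the area between the mean (z = 0) and z_limit."""
    n = max(1, round(z_limit * steps_per_unit))
    h = z_limit / n
    total = 0.5 * (ordinate(0.0) + ordinate(z_limit))
    total += sum(ordinate(i * h) for i in range(1, n))
    return total * h

for z in (1.0, 2.0, 3.0):
    print(f"{z:.2f}  {area_from_mean(z):.4f}")
```

The results can be set beside the last column of Table 11.2 (.342, .478, and
.498) to judge how well the graphical method performs.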


AREAS UNDER THE NORMAL CURVE 
Tables of areas (corresponding to frequencies or proportions of cases)
under the normal curve are developed by the use of approximation
formulas that, in effect, integrate or sum up proportions between specified
partitioning z values. The logic of developing such a table is illustrated by
the graphical integration summarized in Table 11.2.

In Fig. 11.1 a series of points has been plotted on cross-section paper,
the location of each point in two-dimensional space being determined by
its y value (plotted on the ordinate) and its z value (plotted on the abscissa).
Corresponding y and z values have been taken from Table 11.1. The points
have been connected with a smooth curve, constituting nearly one-half of
the normal curve, since it extends from the mean to 3.00σ above the mean.

FIG. 11.1. GRAPHICAL INTEGRATION OF THE NORMAL CURVE
[Figure: values of the ordinate plotted against z values from .00 to 3.00,
with the number of units (squares) in each interval of .20σ marked above the
curve.]

Certain properties of the normal curve, previously noted from its 
mathematical equation, may be observed from the graph. The maximum 
height is at the mean where the z value is .00. The curve drops away 
from the mean with increasing acceleration until the distance of 1.00σ
from the mean is reached. Thereafter, in successive units along the base 
line, the decrease in y value is progressively less. At 3.00 standard deviations 
from the mean, the curve is very close to the base line, but there is no 
indication that it will reach it. 

There are various tables of the area under the curve, but one useful 
version is in terms of the proportion of cases falling between the mean and 
a distance, defined in standard deviation units, away from the mean. An 
example is Table A in the Appendix. In Figure 11.1 this proportion can be 
ascertained from the area, measured in squares, under the curve and
between two vertical lines, one erected at the mean, the other at the
appropriate distance, and divided by the total number of squares under the curve.
Computations are shown in Table 11.2. 


READING A TABLE OF THE NORMAL CURVE 


Although the normal distribution curve can be described as a mathematical 
function (such as Formula 11.5) or presented in graphical form (as in 
Fig. 11.1), it is used most frequently in the form of a table. 

Tables of the normal curve come in two principal forms: tables of the
area (such as Table A, Appendix) and tables of the ordinate or height (such
as Table O, Appendix). Details of presentation differ widely,² but
frequently (as in both Table A and Table O, Appendix) entry is through
x/σ_x, the distance in standard deviation units from the mean.

From a table of the area, the proportion of a normally distributed
variable between any two limits may be readily found. In Table A the
total area of the curve is taken as 1.0000, and for each value of x/σ_x,
or z, the table shows the proportion of cases between that value and the
mean. Thus, the proportion of cases lying between the point one standard
deviation above the mean (x/σ_x = 1.00) and the mean is .3413.

Table A gives values for only one-half the curve, but since the curve is
symmetrical, any area can be easily found. If one limit is below the mean
and the other is above, the areas between each limit and the mean are
summed. Thus the area between -1.00σ and +1.00σ is (.3413 + .3413), or
.6826. Any proportion can be converted into a percentage. Accordingly
it can be found from Table A that if a distribution is normal, 68.26 percent
of the cases will fall between the limits of 1σ above and below the mean.

To find the area between two limits on the same side of the mean, it is
necessary to find the difference between two proportions. From the
proportion of cases between the mean and the limit farther away, we
subtract the proportion of cases between the mean and the nearer limit.
Thus, to find the area between -2.00σ and -1.00σ, we subtract the area
between -1.00σ and the mean (.3413) from the area between the mean
and -2.00σ (.4772). In this way it is ascertained that in a normal
distribution, the proportion of cases between -1.00σ and -2.00σ is .1359. This
finding agrees very well with a similar deduction from Table 11.2, where
the last column yields in abbreviated fashion the same type of information
that Table A (Appendix) gives more precisely. For the partitioning z value
of 2.00, the proportion of the total area of the curve between the mean
and z is given as .478, while for the partitioning z value of 1.00, the
proportion is .342. By subtraction, the area between the limits of 1.00σ and
2.00σ (or -1.00σ and -2.00σ) is .136.

FIG. 11.2. SCHEMATIC REPRESENTATION OF THE AREA OF THE NORMAL CURVE
In this diagram, values of the variable in z form (that is, x/σ_x) are shown on
the x axis (marked at -2.00, -1.00, 0 at the mean, +1.00, and +2.00), while the
y axis represents proportional frequencies. Proportions of N, the total
number of cases, sum to 1.0000, which is taken as the total area.

² While both Tables A and O (Appendix) have five places of decimals, their use
in this chapter and in subsequent chapters assumes that they have been read
correct to four places of decimals.

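
Where a printed table is not at hand, the same areas can be obtained from a
cumulative normal function. The following sketch uses `NormalDist` from the
Python standard library as a stand-in for Table A; the helper `area_between`
is a name introduced here for illustration.

```python
from statistics import NormalDist

unit_normal = NormalDist()   # mean 0, standard deviation 1 (z scores)

def area_between(z_low, z_high):
    """Proportion of a normal distribution lying between two z values."""
    return unit_normal.cdf(z_high) - unit_normal.cdf(z_low)

print(round(area_between(0.0, 1.0), 4))    # mean to +1.00 sigma
print(round(area_between(-1.0, 1.0), 4))   # -1.00 sigma to +1.00 sigma
print(round(area_between(-2.0, -1.0), 4))  # -2.00 sigma to -1.00 sigma
```

The second value may differ from .6826 by one unit in the fourth decimal
place, because the text doubles an entry already rounded to four places.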
SOME APPLICATIONS OF THE NORMAL CURVE 


One application of a table of the area of the normal curve is to find the 
limits of a stated proportion of the normal distribution, either in reference 
to the mean or to one or both ends of the distribution. Below are some 
questions, with answers as deduced from Table A (Appendix). 


LIMITS ENCLOSING MIDDLE 50 PERCENT OF A NORMAL DISTRIBUTION 


One question is: Between what standard deviation limits will be found the
middle 50 percent of a normal distribution?

The first step in answering this question from a table that gives only
one-half the normal distribution is to find the partitioning value that, with
the mean, encloses one-quarter the distribution. From Table A it is seen
that .2486 of the distribution lies between the mean and a z value of .67,
while .2517 is the proportion between the mean and a z value of .68. By
interpolation, a z value can be found that, with the mean, encloses a
proportion of .2500 of the distribution. In this instance the difference
between the two areas (.0031) corresponds to an increment of .01σ. The
difference between the smaller area .2486 and .2500 is .0014. Accordingly,
the more precise partitioning value that, with the mean, encloses a quarter
of the distribution is the lower of the two limits, .67σ, plus (.0014)/(.0031)
of .01σ, which is the difference in the limits. Thus

$$.67\sigma + \frac{.0014}{.0031}(.01\sigma) = .6745\sigma$$
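
The interpolation is the same each time it is used in this chapter, so it can
be written once as a small function. The sketch below is illustrative; the
name `interpolate_z` and its argument order are assumptions, and the tabled
areas are those quoted above from Table A.

```python
def interpolate_z(target_area, z_low, area_low, z_high, area_high):
    """Linear interpolation between two adjacent entries of a table of areas."""
    fraction = (target_area - area_low) / (area_high - area_low)
    return z_low + fraction * (z_high - z_low)

# Middle 50 percent: find z enclosing .2500 between itself and the mean.
z_quartile = interpolate_z(0.2500, 0.67, 0.2486, 0.68, 0.2517)
print(round(z_quartile, 4))   # → 0.6745
```

The same call with other tabled entries reproduces the remaining
interpolations in this section.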


LIMIT BETWEEN 5 AND 95 PERCENT OF A NORMAL DISTRIBUTION 


Another question is: In standard deviation units, what is the partitioning 
value between the lower 95 percent of a normal distribution and the upper 
5 percent? 

Since the normal distribution is symmetrical, the limit that divides the
lower 95 percent from the upper 5 percent will, with sign reversed, also
divide the lower 5 percent from the upper 95 percent. Figure 11.3 illustrates
these divisions graphically. These limits are important in the “one-tailed”
tests of significance to be described in Chapter 13.

With Table A (Appendix) giving the areas between the mean and the
partitioning values (in σ units), it is first of all necessary to find the
partitioning value that separates 45 percent of the cases from the upper
5 percent. The other 50 percent of the cases are, of course, below the mean.
From Table A it is seen that the proportion of cases between a partitioning
value of 1.64σ and the mean is .4495, while the proportion of cases between
1.65σ and the mean is .4505. It is apparent, therefore, that the desired value
is between 1.64σ and 1.65σ. A reasonably precise value can be found by
interpolation. To 1.64σ is added .0005/.0010 of the distance between 1.64σ
and the next partitioning value, 1.65σ, or .005σ. Accordingly, in a normal
distribution, 5 percent of the cases lie above +1.645σ, and 5 percent lie
below -1.645σ.


LIMIT BETWEEN 1 AND 99 PERCENT OF A NORMAL DISTRIBUTION 


Exactly analogously, the partitioning value between the lower 99 percent
and the upper 1 percent of a normal distribution can be found. From
Table A it is seen that .4898 of the cases are between the mean and 2.32σ,
while .4901 of the cases are between the mean and 2.33σ. The desired value,
.4900, is accordingly .0002/.0003, or two-thirds the distance between the
tabled values of 2.32σ and 2.33σ. Therefore 2.32σ + (2/3)(.01σ), or 2.327σ,
can be taken as the dividing line between the lower 99 percent and the
upper 1 percent. It also follows that 1 percent of the cases fall below
-2.327σ.

FIG. 11.3. SELECTED DIVISIONS OF THE NORMAL CURVE
[Figure: a normal curve marked with the limits enclosing the middle 90, 95,
98, and 99 percent of the distribution; the surviving labels include -2.327σ
for the middle 98 percent and -2.575σ for the middle 99 percent.]


LIMITS ENCLOSING MIDDLE 99 PERCENT OF A NORMAL DISTRIBUTION 


A pair of symmetrical limits often useful in testing hypotheses are the
partitioning values that include the middle 99 percent of a normal
distribution. One of these limits will, of course, be above the mean and will be
positive in sign; the other will be of the same absolute size, but will have a
negative sign. Each of the desired limits will be at a point that, with the
mean, includes .495 of the cases.

Table A (Appendix) gives .4949 as the proportion between the mean
and 2.57σ, and .4951 as the proportion between the mean and 2.58σ.
The desired value is 2.57σ + (.0001/.0002)(.01σ), or 2.575σ. Accordingly,
1 percent of the cases in a normal distribution can be expected to fall
outside the limits of -2.575σ and +2.575σ.


LIMITS ENCLOSING THE MIDDLE 95 PERCENT 


The value that encloses the middle 95 percent of a normal distribution can 
be read directly from Table A without interpolation. Since .0250 of the 
cases will be above this limit, .4750 will lie between the limit and the mean. 
Table A gives this partitioning value as 1.96σ. Accordingly, 95 percent of
the cases are between -1.96σ and +1.96σ.

Later it will be noted that certain important statistical functions are 
normally distributed, and in Chapter 13 it will be seen how specified 
partitioning values in z form, dividing the normal curve into two segments, 
can be used in testing hypotheses. 
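
The partitioning values found in this section by interpolation can be compared
with values computed from an inverse normal function. The sketch below relies
on `NormalDist.inv_cdf` from the Python standard library rather than on
Table A, and `upper_limit` is a hypothetical helper name.

```python
from statistics import NormalDist

unit_normal = NormalDist()

def upper_limit(upper_tail):
    """z value above which the stated proportion of a normal distribution lies."""
    return unit_normal.inv_cdf(1 - upper_tail)

cases = (("upper 5 percent", 0.05), ("upper 1 percent", 0.01),
         ("middle 95 percent", 0.025), ("middle 99 percent", 0.005))
for label, tail in cases:
    print(f"{label}: z = {upper_limit(tail):.3f}")
```

The computed limits should agree with the interpolated ones to within about
.001σ; small discrepancies arise because linear interpolation in a four-place
table is itself approximate.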


USE OF A TABLE OF ORDINATES 


Table O (Appendix) of the height of the ordinate at stated x/σ_x, or z, values
is not as useful in educational and psychological research as Table A. 
It could be used to find y in Formula 9.22 for biserial r. However, the 
information in Table P is arranged more conveniently for this purpose. 


INFERENCES ABOUT DISTRIBUTIONS WHEN NORMALITY IS 
ASSUMED 


If a distribution is normal, knowledge of three constants, N (the total
number of cases), the mean, and the standard deviation, permits inference
as to the number of cases between any two values or between a single value
and either end of the distribution. If N is unknown, similar inferences can
be made, but only in terms of proportions or percentages.

If a proportion of the total distribution is stated, together with one limit
(which may be one end of the distribution), the other limit may be found.
No observed distribution is ever perfectly normal, but in many cases an
empirical distribution can be taken as representing a normal population.
If a distribution departs considerably from normality, inferences on the
basis of normal curve properties may still be useful, but of course the
greater the departure from normality, the less valid the inferences are
likely to be.

Another limitation in applying the mathematically continuous normal
curve to psychological data is that many observed psychological variables
are effectively discrete. An I.Q., for example, may be either 117 or 118; it is
never recorded at an intermediate value. This results in slightly less valid
inferences than might be the case with truly continuous variables.

The problems below can all be solved with information in Table A.


PREDICTION OF NUMBER OF CASES BELOW A STATED CUT-OFF POINT 


On a certain mechanical aptitude test, the mean is 15.4 and the standard
deviation is 4.6. On the assumptions that the variable is normally distributed
and that future applicants for employment will show the same M and s,
below what cut-off point can 85 percent of the applicants be expected to
fall?

Solution: The first requirement is to find the value of x/σ_x that is the
partitioning value between the lower 85 percent and the upper 15 percent
of a normal distribution. It will be the point that separates .3500 of the
cases from the mean. From Table A, we have:

PARTITIONING     AREA TO
VALUE            MEAN

1.03σ            .3485
1.04σ            .3508

By interpolation, the partitioning value corresponding to an area of
.3500 is 1.03σ + (.0015/.0023)(.01σ) = 1.037σ. In the empirical
distribution, the point 1.037σ above the mean would be expected to be 15.4 +
(1.037)(4.6), or 20.2. Accordingly, one would expect approximately 85
percent of the scores to be 20 or less and 15 percent of the scores to be 21
or more.
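
The cut-off point can also be computed directly, under the same assumption of
normality. This sketch uses `NormalDist` from the Python standard library in
place of Table A and interpolation; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumed distribution of test scores: mean 15.4, standard deviation 4.6.
scores = NormalDist(mu=15.4, sigma=4.6)

cutoff = scores.inv_cdf(0.85)   # point below which 85 percent of cases fall
print(round(cutoff, 1))         # → 20.2
```

Since scores are whole numbers, a cut-off of 20.2 separates scores of 20 or
less from scores of 21 or more, as in the solution above.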

PREDICTION OF NUMBER OF CASES BETWEEN STATED LIMITS


If intelligence is normally distributed and if an intelligence test is
standardized so that the mean I.Q. is 100, with a standard deviation of 14.83,
how many children in 1000 would be expected to have I.Q.'s of 130 to 134,
inclusive?

Solution: With integral I.Q.'s, the appropriate partitioning values are
129.5 and 134.5. In σ units these partitioning values are (129.5 - 100)/14.83
and (134.5 - 100)/14.83, or 1.99σ and 2.33σ. These values are found by
applying the usual formula for a z score, z = (X - M_x)/s_x. From Table A
it is seen that .4767 of a normal distribution lies between the mean and
1.99σ and .4901 between the mean and 2.33σ. Subtracting .4767 from
.4901 yields .0134 as the proportion between the stated limits. Accordingly,
13 children in 1000 could be expected to have I.Q.'s of 130 through 134.
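
The same prediction can be sketched in a few lines, again assuming normality
and using `NormalDist` from the Python standard library instead of Table A.
The limits 129.5 and 134.5 reflect the treatment of integral I.Q.'s described
in the solution.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=14.83)

# I.Q.'s of 130 through 134 occupy the interval 129.5 to 134.5.
proportion = iq.cdf(134.5) - iq.cdf(129.5)
expected_children = 1000 * proportion
print(round(proportion, 4), round(expected_children))
```

The proportion should come out close to .0134 and the expected number of
children to 13.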


STANDARD SCORES WITH PREDETERMINED CHARACTERISTICS 


In reporting the results of a civil service examination, a system of standard
scores is to be used. A mean and standard deviation (M' and s') are to be
assigned such that, on the assumption of normality, 10 percent of the
converted scores will be 70 or below and 2 percent will be 98 or above.
Find M' and s'.

Solution: Information about an observed mean and standard deviation
is not required, since the problem relates to the normal curve generally.
By interpolating in Table A it is inferred that .4800 of a normal curve lies
between the mean and the partitioning value of +2.0540σ, while .4000 lies
between the mean and -1.2817σ. Accordingly, 2 percent of the cases can
be expected to be above +2.0540σ and 10 percent below -1.2817σ. The
28 specified standard-score units (70 to 98) correspond to a range of
(2.0540 + 1.2817)σ, or 3.3357σ. Dividing 28 by 3.3357 yields 8.39 as the
s' to be assigned as the standard deviation of the standard scores. Since
M' is to be 1.2817σ above 70, M' is 70 + (1.2817 × 8.39), or 80.75.
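
The two conditions amount to two equations in the two unknowns M' and s'. A
sketch of the computation follows; it uses `NormalDist.inv_cdf` from the
Python standard library in place of interpolation in Table A, and the
variable names are assumptions.

```python
from statistics import NormalDist

unit_normal = NormalDist()
z_low = unit_normal.inv_cdf(0.10)    # 10 percent of cases at or below 70
z_high = unit_normal.inv_cdf(0.98)   # 2 percent of cases at or above 98

# 98 - 70 standard-score units span (z_high - z_low) standard deviations.
s_prime = (98 - 70) / (z_high - z_low)
m_prime = 70 - z_low * s_prime       # z_low is negative, so M' lies above 70
print(round(s_prime, 2), round(m_prime, 2))
```

The results should agree with 8.39 and 80.75 to within about .01; the
remaining difference comes from rounding in the tabled values.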


DETERMINATION OF SCALE VALUES OF ITEMS 


On an attitude test, percentages of subjects agree with statements A through
G as follows: A, 87; B, 68; C, 45; D, 55; E, 30; F, 5; and G, 15. What are
plausible scale values of these statements?

Solution: On the assumption that each item is a measure of a normally
distributed characteristic for which only dichotomous information
(agree-disagree) is available, the s distance from the mean to the point of
dichotomy may be taken as the working scale value. Such values may be adjusted
by changing the reference point from the mean to an arbitrary origin.
Computations are shown as a table below.

                      AREA BETWEEN   DIRECTION OF
                      POINT OF       POINT OF
STATE-    PERCENT     DICHOTOMY      DICHOTOMY                ADJUSTED
MENT      AGREEING    AND MEAN       FROM MEAN     s VALUE    s VALUE
A           87          .3700        Above           1.13       2.83
B           68          .1800        Above            .47       2.17
C           45          .0500        Below           -.13       1.57
D           55          .0500        Above            .13       1.83
E           30          .2000        Below           -.52       1.18
F            5          .4500        Below          -1.64        .06
G           15          .3500        Below          -1.04        .66

The area between the point of dichotomy and the mean is found by
converting the percent agreeing with the statement to a proportion and
subtracting .5000. A negative value indicates that the dichotomy is below
the mean. From Table A (Appendix), values in standard deviation units
are found. To make all scale values positive, any convenient constant may be
added to the obtained s values. In the last column of the table here, 1.70
has been added to the s values of the preceding column. These final values
probably represent relative popularity (or difficulty in the case of test
items) somewhat more adequately than do the original percentages,
although, of course, there is no change in order.
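
The scale values in the table can be recomputed as a check. The sketch below
converts each percentage to a proportion and reads the corresponding z value
from `NormalDist.inv_cdf` in the Python standard library, which stands in for
Table A; the dictionary names are illustrative assumptions.

```python
from statistics import NormalDist

percent_agreeing = {"A": 87, "B": 68, "C": 45, "D": 55, "E": 30, "F": 5, "G": 15}
unit_normal = NormalDist()
constant = 1.70   # arbitrary constant added to make all scale values positive

scale_values = {}
for statement, percent in percent_agreeing.items():
    s_value = unit_normal.inv_cdf(percent / 100)   # signed distance from the mean
    scale_values[statement] = (round(s_value, 2), round(s_value + constant, 2))

for statement, (s_value, adjusted) in scale_values.items():
    print(statement, s_value, adjusted)
```

Each printed pair should match the s value and adjusted s value columns of
the table above.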


THE PROBABLE ERROR AND THE STANDARD DEVIATION


One of the major applications of the normal curve is in connection with
errors of observation or measurement, which are often found to be
normally distributed. Accordingly, it is appropriate that when Q (half the
distance between the 75th percentile and the 25th percentile, as described
in Chapter 4) is applied to a normal distribution, it is called the probable
error.³ It has already been found that 25 percent of the cases lie between
the mean and .6745σ. Accordingly, in a normal curve, P₂₅ (or -1 P.E.)
is at -.6745σ and P₇₅ (or +1 P.E.) is at +.6745σ. From Table A the
relationships in the following table can be worked out:

                                      PROPORTION
     PARTITIONING VALUES              OF TOTAL
P.E. UNITS          σ UNITS           AREA ENCLOSED
±1.0000 P.E.        ± .6745σ            .5000
±2.0000 P.E.        ±1.3490σ            .8226
±3.0000 P.E.        ±2.0235σ            .9570
±4.0000 P.E.        ±2.6980σ            .9930
±1.4826 P.E.        ±1.0000σ            .6826
±2.9652 P.E.        ±2.0000σ            .9544
±4.4478 P.E.        ±3.0000σ            .9973

³ Abbreviated either P.E. or PE.


NONNORMALITY IN DISTRIBUTIONS: ASYMMETRY, OR SKEWNESS


Many obtained distributions of psychological and educational variables
differ from the normal, very frequently by being skew, or asymmetrical,
as shown in Fig. 11.4. If the items of a test are very easy, high total scores
will be more numerous than low scores, and the distribution will be
negatively skewed and the tail at the low end of the distribution will be
elongated. On the other hand, if the test is composed of very difficult items,
low scores will be common, high scores will be relatively rare, and the
distribution will show positive skewness, that is, the long, thin tail will
extend toward the high side of the distribution.

There are other reasons for skewness. In some cases, the true distribution 
of a human characteristic, such as weight, may be skewed rather than 
symmetrical. In other cases, units of measurement that yield equal 
numerical differences may not represent equal differences in the character- 
istic. If the units at one end of the scale are systematically larger or smaller 
than the units at the other, the resultant distribution will be skewed, even 
if the underlying trait is normally distributed. 


[Figure: four panels showing a skewed curve (negative skewness), a skewed
curve (positive skewness), a leptokurtic curve, and a platykurtic curve.]

FIG. 11.4. SOME NONNORMAL CURVES


Another cause of skewness is selection of a sample on the basis of a
correlated variable. Suppose, for example, all 10-year-old children in a
school system with I.Q.'s of 115 or more were given a reading test.
The intelligence distribution would certainly be asymmetrical because it
would be the upper portion of a more or less normal distribution. It is
virtually certain that the scores on a well-made reading test administered
to this group would also be positively skewed, although not as much as the
I.Q.'s. On the other hand, a variable uncorrelated with intelligence, such
as pitch discrimination, might show a normal distribution even in this
highly selected group.




FORMULAS REFLECTING SKEWNESS 


The simplest procedure for the detection of skewness is direct inspection of 
the tabulated frequencies. However, several formulas exist for obtaining a 
coefficient reflecting the degree of skewness. The common feature of these 
formulas is that negative coefficients indicate prolongation of the distri- 
bution toward the low values and concentration of scores at the high end 
of the scale, while coefficients carrying the plus sign indicate the reverse. 
The coefficients computed by the different formulas are not comparable, 
and with one exception, they are difficult or impossible to evaluate statis- 
tically. The logical bases of the formulas, however, help to clarify the 
concept of skewness.

In a normal distribution, mean, median, and mode coincide. An 
abundance of extreme scores at one end of the distribution lowers or 
raises the mean, but in general does not affect the other two measures. 
Therefore, if the mean is less than the median (or mode), the distribution 
is negatively skewed, whereas if it is greater, the skew is positive. One of the 
earliest measures was developed by Karl Pearson: 


\[ Sk = \frac{\text{mean} - \text{mode}}{s} \tag{11.6} \]

which is simply the difference between the two measures in standard
deviation units. However, the mode envisaged by the formula is not the
familiar crude mode (the score of greatest frequency, or some function of
the step with largest numbers of cases), but rather a statistic estimated as
three times the median less twice the mean. The following formula involves
the median (or 50th percentile) and is merely Formula 11.6 revised in
accordance with the procedure for estimating the mode.

\[ Sk = \frac{3(\text{mean} - P_{50})}{s} \tag{11.6a} \]


The next measure also capitalizes on the difference between a measure
that tends not to be affected by skew (again the median) and a measure
likely to be affected, the mean of two percentiles, P₁₀ and P₉₀. Although
(P₁₀ + P₉₀)/2 is a “point measure,” its behavior resembles that of the
mean in that it may change when a substantial proportion of the values at
one end of the distribution are not symmetrical with a similar proportion
at the other end. The formula is

\[ Sk = \frac{P_{10} + P_{90}}{2} - P_{50} \tag{11.7} \]


The next statistic takes into account all the values in the distribution.
It is generally noted as g₁, and is simply the mean of the cubes of all the
values in z form:

\[ g_1 = \frac{\sum z^3}{N} \tag{11.8} \]
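
The three coefficients (Formulas 11.6a, 11.7, and 11.8) can be compared on a small set of scores. This is a minimal sketch under stated assumptions: the scores are hypothetical, the function names are illustrative, and the percentiles come from `statistics.quantiles` with the `"inclusive"` method, which will not in general agree exactly with the grouped-data percentile procedure of Chapter 4.

```python
from statistics import mean, median, pstdev, quantiles


def pearson_skewness(scores):
    """Formula 11.6a: 3(mean - median) / s."""
    return 3 * (mean(scores) - median(scores)) / pstdev(scores)


def percentile_skewness(scores):
    """Formula 11.7: (P10 + P90)/2 - P50, expressed in raw-score units."""
    deciles = quantiles(scores, n=10, method="inclusive")  # P10, P20, ..., P90
    p10, p90 = deciles[0], deciles[-1]
    return (p10 + p90) / 2 - median(scores)


def g1(scores):
    """Formula 11.8: the mean of the cubed z scores."""
    m, s = mean(scores), pstdev(scores)
    return sum(((x - m) / s) ** 3 for x in scores) / len(scores)


# Hypothetical scores with a long tail toward the high values.
skewed_scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 9, 12, 15]

for name, func in [("Pearson (11.6a)", pearson_skewness),
                   ("Percentile (11.7)", percentile_skewness),
                   ("g1 (11.8)", g1)]:
    print(f"{name}: {func(skewed_scores):+.3f}")
```

All three coefficients carry the plus sign for these scores, and reversing the scores (taking their negatives) reverses every sign. The magnitudes differ from one formula to another, which illustrates why the coefficients are not comparable.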


MOMENTS ABOUT THE MEAN IN TESTING SKEWNESS 


Formula 11.8 sometimes is written in terms of the moments about the
mean, which may be denoted as m₁, m₂, m₃, m₄, and so on. The first
moment is the mean of the deviations from the mean and is, of course,
always 0:

\[ m_1 = \frac{\sum x}{N} = 0 \]

The second moment is the mean of the squares of the deviations from
the mean, and is the sample variance:

\[ m_2 = \frac{\sum x^2}{N} = V_x = s_x^2 \]

Similarly, the third moment about the mean is the mean of the cubes of 
the deviations; the fourth moment, the mean of the fourth powers; the 
fifth moment, the mean of the fifth powers; and so on. 

It is readily seen that in a perfectly symmetrical distribution (including a 
normal distribution), all odd-numbered moments are zero. The first 
moment is not affected by the shape of the distribution, and for statistical 
purposes, moments beyond the fourth are seldom used. Consequently, a 
measure of skewness is built on the third moment, Σx³/N.

Because of the “weight” of the values in a long tail, any distribution
with positive skewness will have a positive third moment, while a
distribution skewed toward the lower limit will have a negative third moment.
However, the absolute magnitude of the third moment, as is the absolute
magnitude of a variance, is a function of the units of measurement. By
dividing the third moment by the cube of the standard deviation, a
statistic, g₁, is obtained that is independent of these units; it is, in fact, the
mean of the cubes of the z scores.

A computing formula in terms of the sum of deviations, the sum of
squares of deviations, and the sum of third powers of deviations, all in
terms of step intervals from an arbitrary origin, is given as:

\[ g_1 = \frac{m_3}{m_2\sqrt{m_2}} = \frac{\sum z^3}{N} = \frac{N^2\sum x'^3 - 3N\sum x'^2\sum x' + 2(\sum x')^3}{[N\sum x'^2 - (\sum x')^2]^{3/2}} \tag{11.8a} \]

Like the measures of skewness involving percentiles, obtained values
of this coefficient are difficult to evaluate because the expected distribution
of the statistic based on successive samples of N cases each, drawn at
random from an unlimited normal population, is unknown. To remedy this
difficulty, Fisher has suggested a modified g₁, with known sampling
distribution, as follows:

\[ g'_1 = \frac{\sum z^3}{N} \cdot \frac{\sqrt{N(N-1)}}{N-2} \tag{11.8b} \]

Standard errors and their applications are to be treated in Chapter 13.
The standard error of the g′₁ of Formula 11.8b is

\[ \sigma_{g'_1} = \sqrt{\frac{6N(N-1)}{(N-2)(N+1)(N+3)}} \tag{11.9} \]

This is the expected standard deviation of a large distribution of g′₁,
each computed from a random sample of N cases drawn from a normal
population.

The use of Formulas 11.8a and 11.8b in testing skewness is illustrated
in Example 11.2.
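
Formulas 11.8b and 11.9 depend only on g₁ and N, so they are easily expressed as two small functions. The sketch below is illustrative only; the function names `fisher_g1` and `se_fisher_g1` are assumptions, not notation from the text.

```python
from math import sqrt


def fisher_g1(g1: float, n: int) -> float:
    """Formula 11.8b: g1 multiplied by sqrt(N(N - 1)) / (N - 2)."""
    return g1 * sqrt(n * (n - 1)) / (n - 2)


def se_fisher_g1(n: int) -> float:
    """Formula 11.9: standard error of Fisher's g'1 for samples from a normal population."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


# The multiplier applied to g1 shrinks toward 1, and the standard error
# shrinks toward 0, as N increases.
for n in (25, 100, 1016):
    print(f"N = {n:4d}: multiplier = {fisher_g1(1.0, n):.4f}, "
          f"standard error = {se_fisher_g1(n):.3f}")
```

For a sample as large as N = 1016 the multiplier is about 1.0015, so g′₁ and g₁ are practically identical; with small samples the correction is more noticeable.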


EXAMPLE 11.2 


USE OF HIGHER MOMENTS IN TESTING FOR SKEWNESS AND KURTOSIS 


Inspection of the distribution of the scores of 1016 high school seniors on a
reading test shows that it is reasonably symmetrical, with neither skewness nor
kurtosis apparent. Nevertheless, tests for these characteristics may be applied.

Skewness. For finding g₁ (defined as m₃/m₂^(3/2) = m₃/sₓ³ = Σz³/N), the first,
second, and third moments around the mean or around an arbitrary origin are
necessary. Formula 11.8a is written in terms of x′, but either x or X can be
substituted without other change, since the formula applies to:

1. Sums of powers of x, the deviations in original units from the mean (in
which case all terms involving Σx would be zero);

2. Sums of powers of x′, the deviations in step-interval units from an arbitrary
origin; and

3. Sums of powers of X, or raw scores (that is, deviations in original units
from zero).

In the numerical example (Table 11.3), frequencies are multiplied by
corresponding d (deviations in terms of step intervals) or powers of d to form
Σx′, Σx′², and Σx′³. These values, together with N, are then substituted in
Formula 11.8a to find g₁, which is +.056, showing a very slight tendency to
positive skewness.

To find Fisher's g′₁, the value of Σz³/N (which happens to be +.056) is
multiplied by √(N(N − 1))/(N − 2), which in this case is 1.0015. The N is so large
that here the correction is negligible and g₁ is unchanged. By Formula 11.9, the
standard error of g′₁ is .077.

TABLE 11.3. TESTING FOR SKEWNESS AND KURTOSIS
(Scores of 1016 High School Seniors on a Reading Test)

STEPS        f      d     d²      d³      d⁴
33-35        3      5     25     125     625
30-32       14      4     16      64     256
27-29       38      3      9      27      81
24-26      129      2      4       8      16
21-23      191      1      1       1       1
18-20      233      0      0       0       0
15-17      215     −1      1      −1       1
12-14      128     −2      4      −8      16
 9-11       52     −3      9     −27      81
 6-8        10     −4     16     −64     256
 3-5         3     −5     25    −125     625

N = 1016

Computations

Σx′  = Σdf  = −48
Σx′² = Σd²f = 2778
Σx′³ = Σd³f = −138
Σx′⁴ = Σd⁴f = 21,702

By Formula 11.8a,

\[ g_1 = \frac{\sum z^3}{N} = \frac{N^2\sum x'^3 - 3N\sum x'^2\sum x' + 2(\sum x')^3}{[N\sum x'^2 - (\sum x')^2]^{3/2}} = \frac{(1016)^2(-138) - 3(1016)(2778)(-48) + 2(-48)^3}{[(1016)(2778) - (-48)^2]^{3/2}} = \frac{263{,}760{,}000}{(2{,}820{,}144)(1679.328)} = +.056 \]

By Formulas 11.10 and 11.11,

\[ g_2 = \frac{\sum z^4}{N} - 3 = \frac{N^3\sum x'^4 - 4N^2\sum x'^3\sum x' + 6N\sum x'^2(\sum x')^2 - 3(\sum x')^4}{[N\sum x'^2 - (\sum x')^2]^2} - 3 = \frac{22{,}772{,}102{,}968{,}320}{7{,}953{,}212{,}180{,}736} - 3 = 2.863 - 3.000 = -.137 \]


It has already been noted that the middle 95 percent of the cases of a normal
distribution lie between the limits of −1.96σ and +1.96σ. The standard error
of g′₁ can be taken as an estimate of the standard deviation of an indefinitely
large number of g′₁, each computed from a sample of 1016 cases drawn randomly
from an indefinitely large, normally distributed population. Therefore, 5 percent
of these g′₁ could be expected to exceed ±1.96 × .077; that is, they would be
outside the limits of −.151 and +.151. Since the obtained g′₁ (+.056) is well
inside these limits, the value is such that it would ordinarily be accepted as
deviating from .000 only by chance. Hence the distribution can be regarded as
showing no significant degree of skewness.
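
The skewness test of this example can be retraced from the frequencies of Table 11.3. What follows is a minimal sketch; the variable names are illustrative, and the arithmetic simply follows Formulas 11.8a, 11.8b, and 11.9. Because the standard error is carried here without rounding to .077, the limit comes out near .150 rather than .151.

```python
from math import sqrt

# Table 11.3: step deviations d (measured from the 18-20 step) and frequencies f.
D = [5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5]
F = [3, 14, 38, 129, 191, 233, 215, 128, 52, 10, 3]


def power_sum(k: int) -> int:
    """Sum of f * d**k over all steps, that is, the sum of the k-th powers of x'."""
    return sum(f * d ** k for d, f in zip(D, F))


N = sum(F)
S1, S2, S3 = power_sum(1), power_sum(2), power_sum(3)

# Formula 11.8a: g1 from sums taken about an arbitrary origin.
g1 = (N**2 * S3 - 3 * N * S2 * S1 + 2 * S1**3) / (N * S2 - S1**2) ** 1.5

# Formula 11.8b (Fisher's g'1) and Formula 11.9 (its standard error).
g1_fisher = g1 * sqrt(N * (N - 1)) / (N - 2)
se = sqrt(6 * N * (N - 1) / ((N - 2) * (N + 1) * (N + 3)))

# Limits enclosing the middle 95 percent of g'1 values under normality.
limit = 1.96 * se
significant = abs(g1_fisher) > limit

print(f"N = {N}, sums = {S1}, {S2}, {S3}")
print(f"g1 = {g1:+.3f}, g'1 = {g1_fisher:+.3f}, standard error = {se:.3f}")
print(f"limits = ±{limit:.3f}, significant skewness: {significant}")
```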

Kurtosis. To test the peakedness, or kurtosis, of a distribution, we first find the
mean of the fourth powers of the z scores by means of Formula 11.11 which, like
the formula for Σz³/N, is applicable to sums of powers of deviations from the
mean, or to deviations in step-interval units from an arbitrary origin, or to raw
scores. The numerical example yields: Σx′, Σx′², Σx′³, and Σx′⁴.

In a perfectly normal distribution, Σz⁴/N has a value of 3.00. Accordingly,
3.00 is subtracted from Σz⁴/N to find g₂, a measure of kurtosis. In the example,
g₂ = Σz⁴/N − 3.00 = 2.863 − 3.000 = −.137, indicating a very slight tendency
for the distribution to be flat-topped. However, by Fisher's method of testing
the significance of g₂ (as computed by a formula for which the standard error
is known), −.137 is not significantly different from zero, and as far as this test
is concerned, the distribution is normal.


NONNORMALITY IN DISTRIBUTIONS: PEAKEDNESS, OR KURTOSIS

Two symmetrical distributions with identical standard deviations may
nevertheless differ considerably in contour. If the distribution has a high
peak, with long tails going out in either direction, it is called leptokurtic,
but if the distribution is relatively flat-topped, with stubby tails, it is called
platykurtic. Examples are shown graphically in Fig. 11.4. Intermediate
between the leptokurtic and the platykurtic distribution is the mesokurtic,
exemplified by the normal curve.

The use of adjectives to describe the relative shapes of obtained
symmetrical distributions is undoubtedly helpful, although the common terms
peaked and flat-topped are probably just as useful as the fancier words
derived from the Greek. Nonsymmetrical distributions also differ in
peakedness, and some method of measuring this characteristic is desirable.

It can be demonstrated mathematically that the mean of the fourth
powers of the z scores of a perfectly normal distribution is 3.00. It is also
generally true that distributions with considerable peakedness tend to have
values of Σz⁴/N greater than 3.00 (because of the high values of the fourth
powers of the values out in the long tails), and that distributions of the
same variance, but flatter than normal, tend to have values of Σz⁴/N less
than 3.00. Accordingly, g₂, as an indication of peakedness, may be
defined as

\[ g_2 = \frac{m_4}{m_2^2} - 3 = \frac{\sum z^4}{N} - 3 \tag{11.10} \]

With notation identical to that used in Formula 11.8a and with Σx′⁴
indicating the sum of the fourth powers of deviations in terms of step
intervals from an arbitrary origin, a computing formula for Σz⁴/N is

\[ \frac{\sum z^4}{N} = \frac{N^3\sum x'^4 - 4N^2\sum x'^3\sum x' + 6N\sum x'^2(\sum x')^2 - 3(\sum x')^4}{[N\sum x'^2 - (\sum x')^2]^2} \tag{11.11} \]

In Formula 11.10 the fourth moment, m₄, has been divided by the
square of the second moment, m₂, to eliminate the effect of the units of
measurement. It can be shown that m₄/m₂² is precisely the mean of the
fourth powers of the z scores.
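
The equivalence of the computing formula (11.11) and the definition in terms of moments (11.10) can be illustrated with the frequencies of Table 11.3. This is a sketch only; the variable names are illustrative, and the second route simply computes m₂ and m₄ from deviations about the mean in step-interval units.

```python
# Table 11.3: step deviations d (measured from the 18-20 step) and frequencies f.
D = [5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5]
F = [3, 14, 38, 129, 191, 233, 215, 128, 52, 10, 3]
N = sum(F)


def power_sum(k: int) -> int:
    """Sum of f * d**k over all steps, that is, the sum of the k-th powers of x'."""
    return sum(f * d ** k for d, f in zip(D, F))


S1, S2, S3, S4 = (power_sum(k) for k in (1, 2, 3, 4))

# Route 1, Formula 11.11: mean fourth power of z from sums about an arbitrary origin.
mean_z4 = (
    N**3 * S4 - 4 * N**2 * S3 * S1 + 6 * N * S2 * S1**2 - 3 * S1**4
) / (N * S2 - S1**2) ** 2
g2 = mean_z4 - 3

# Route 2, Formula 11.10: m4 / m2**2 - 3 from deviations about the mean.
mean_d = S1 / N
m2 = sum(f * (d - mean_d) ** 2 for d, f in zip(D, F)) / N
m4 = sum(f * (d - mean_d) ** 4 for d, f in zip(D, F)) / N
g2_from_moments = m4 / m2**2 - 3

print(f"mean of z^4 = {mean_z4:.3f}, g2 = {g2:+.3f}")
print(f"g2 from moments about the mean = {g2_from_moments:+.3f}")
```

The two routes agree apart from floating-point rounding, and the negative sign of g₂ points to a slightly flat-topped distribution.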

Fisher (1) has proposed a modification of g₂, a modification with a
known sampling distribution, to which the interested reader is referred.
There is also a “measure of kurtosis” based on percentiles. However, the
utility of all these coefficients as measures of the degree of one type of
departure from normality has been questioned by mathematical
statisticians. Nevertheless, the use of adjectives to describe “peakedness,” with
Formula 11.10 to indicate the direction of the variation from normality,
may be useful occasionally. Such a use is demonstrated in Example 11.2.


TESTING A DISTRIBUTION FOR NORMALITY WITH CHI SQUARE 


When frequencies are distributed in categories, it is possible to generate a
set of theoretical frequencies, with identical total N, distributed in the
same categories in accordance with some principle. Whether or not the
difference between the two sets of frequencies can be accounted for by
sampling variation can be tested by chi square, already described in
Chapter 3.

Chi square affords a method of testing the agreement between any
observed grouping of frequencies and the way the same total number of
frequencies would be distributed in accordance with some hypothesis (in
the present instance, the normal curve).

Actually, the curve with which the observed distribution is compared is
not precisely the continuous normal curve as generated mathematically,
but rather is a distribution in discrete steps (much like a histogram) that
closely approximates the normal curve. This theoretical distribution has
the same N, mean, and standard deviation as the observed distribution, and
in practice, the difference between this theoretical distribution and the true
normal curve is of no importance.


Computational details are shown in Example 11.3. Briefly the method 
involves: 


1. Determining the mean and standard deviation of the observed
frequency distribution.

2. Developing the theoretical distribution, the one that would obtain if
the hypothesis of a normal curve were true. The theoretical distribution
must have N, M, and s identical with the observed distribution, and the
step frequencies must be such that they yield the closest possible
approximation to the normal curve.

3. There is a further requirement that no fₑ (the frequency expected in a
category) be less than 5. When one or more fₑ as computed are less than
5, categories in the theoretical distribution must be consolidated to obtain
fₑ of 5 or more. Corresponding categories in the obtained distribution are
also consolidated. This requirement is sometimes relaxed when the number
of categories (and hence the number of df) is large.

4. Chi square is computed by finding the differences between frequencies
in corresponding categories (one from each distribution), squaring these
differences, dividing each square by its fₑ, and summing the quotients.
This is Formula 3.1, χ² = Σ(fₒ − fₑ)²/fₑ. In Example 11.3 an algebraic
variant of this formula is employed, but the value of χ² is exactly the same.

5. As is always the case, the evaluation of a computed value of χ²
requires knowledge of the degrees of freedom (df), since a chi square table
is in effect a series of curves, one for 1 df, another for 2 df, a third for 3 df,
and so on, until df becomes large. By entering a chi square table (Table C,
Appendix) by means of df and the numerical value of χ², P may be
determined. P is the probability of obtaining a χ² as large as the one observed
simply by chance.


EXAMPLE 11.3 


TESTING A DISTRIBUTION FOR NORMALITY WITH CHI SQUARE 


In the numerical example in Table 11.4 the first column gives the working
step limits, with the true lower limit or partition value one-half unit below the
stated lower limit of each step. Thus the true lower limit of the top step is 32.5.

TABLE 11.4. CALCULATION OF THE CHI-SQUARE TEST OF “GOODNESS OF FIT”
(Scores of 1016 High School Seniors on a Reading Test)

  (1)     (2)     (3)      (4)      (5)      (6)       (7)       (8)
                         AREA TO
 STEPS     fₒ    x/sₓ     MEAN      pₑ       fₑ        fₒ²     fₒ²/fₑ
 33-35      3    2.75     .497     .003     3.05
 30-32     14    2.15     .484     .013    13.21      289      17.77
 27-29     38    1.54     .438     .046    46.74     1444      30.89
 24-26    129     .94     .326     .112   113.79    16641     146.24
 21-23    191     .33     .129     .197   200.15    36481     182.27
 18-20    233    −.27    −.106     .235   238.76    54289     227.38
 15-17    215    −.88    −.311     .205   208.28    46225     221.96
 12-14    128   −1.48    −.431     .120   121.92    16384     134.38
  9-11     52   −2.09    −.482     .051    51.82     2704      52.18
  6-8      10   −2.69    −.496     .014    14.22
  3-5       3                      .004     4.06      169       9.25

                                Σfₑ = 1016.00    Σ(fₒ²/fₑ) = 1022.12

The two top steps are combined (fₒ = 17, fₑ = 16.26), as are the two bottom
steps (fₒ = 13, fₑ = 18.28); the entries in columns 7 and 8 for these steps
refer to the combined categories.

Computations

N = 1016
Mₓ = 18.86
sₓ = 4.96
χ² = Σ(fₒ²/fₑ) − N = 1022.12 − 1016 = 6.12
df = n′ − 3 = 6
.50 > P > .30

The second column gives the observed frequency (fₒ) of each step.

The third column shows the true lower limit as a deviation from the mean,
divided by the standard deviation of the distribution; that is, (Xₗ − Mₓ)/sₓ = xₗ/sₓ.
Using this information with a table of area of the normal curve, such as Table A
(Appendix), the entries of column 4 are found: the proportion of a normal curve
between the lower step limit and the mean. For example, when Table A is entered
with x/sₓ of 2.75, it is found that .497 of a normal distribution lies between this
limit and the mean. It follows that .500 − .497, or a pₑ (expected proportion)
of .003 would be anticipated in the top step; (.497 − .484), or a pₑ of .013 in
the second step; (.484 − .438), or .046 in the third step, and so on. For steps
for which x/sₓ is negative, the area has been denoted with a minus sign to show
that the determination of the expected proportion of cases on any step is algebraic;
that is, for the step that includes the mean, pₑ = .129 − (−.106) = .235, and for
the step below, pₑ = −.106 − (−.311) = .205.

Entry in the sixth column (fₑ), the frequency expected in the step on the
hypothesis of a normal distribution, is determined by multiplying the
corresponding pₑ by N; in this case, 1016.

In this example, two fₑ's are less than 5. Accordingly, fₑ's in the steps at the
end are combined with fₑ's in adjacent steps; and fₒ's in corresponding steps in
column 2 are similarly combined.

Chi square may be readily found by the following formula:

\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \tag{3.1} \]

This requires that the difference be found between each fₒ and corresponding
fₑ, that this difference be squared, and then divided by the fₑ. The sum of all
these quotients is χ².

A slightly more convenient way to find chi square is by the formula

\[ \chi^2 = \sum \frac{f_o^2}{f_e} - N \tag{3.3} \]

In column 7 of Table 11.4 are the squares of the observed frequencies. When
each square is divided by the corresponding fₑ, the result is the quotient in column
8. The sum of these quotients less N is χ².

Instruction booklets for most desk calculators give a procedure for summing
quotients when a series of divisions is performed.

In developing the normal distribution in column 6, it was made so that the
number of cases, the mean, and the standard deviation would be identical with
those of the observed distribution. This reduced the df for the chi-square test
from 9 (n′, or the number of categories) by 3 (the number of imposed restrictions)
to 6.

For six degrees of freedom, the P value of a χ² of 6.12 is between .50 and .30.
(A more exact P may be found, if desired, by interpolation.) Since with 6 df,
a χ² of 6.12 is expected by chance between 30 and 50 percent of the time, the
hypothesis of a normally distributed population remains tenable.

INTERPRETATION OF CHI SQUARE

If the population is distributed according to some hypothesis (in this case
the hypothesis of normality), then a long series of samples might include
one or more samples yielding a χ² of .00 (indicating no difference at all
between the sample and the theoretical distribution). In general, however,
the χ² will be greater than zero, and the chi-square distribution for the
given degrees of freedom can be used to infer the theoretical rarity of any
obtained chi square.

When P is .05 or less, the hypothesis of no difference between the
obtained distribution and a normal distribution is rejected at the 5 percent
level of significance. In this case, there is only 1 chance in 20 (or less)
that a chi square as large as the one obtained could have been found if the
sample were drawn at random from a normally distributed population.

In Example 11.3, P is between .30 and .50; hence, by conventional
standards, normality has not been rejected. The distribution can be said
to be compatible with the hypothesis that the characteristic is distributed
normally in the population that the sample represents.
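
The goodness-of-fit computation of Example 11.3 can be sketched in a few lines. The sketch rests on several assumptions: `statistics.NormalDist` replaces Table A, expected frequencies are carried without rounding (so χ² comes out slightly different from the 6.12 obtained with three-place areas), only the end categories are pooled, and the two comparison values are the standard chi-square table entries for 6 df at P = .50 and P = .30. All function and variable names are illustrative.

```python
from statistics import NormalDist

# True lower limits of the ten upper steps (top step first); the bottom step is open below.
LOWER_LIMITS = [32.5, 29.5, 26.5, 23.5, 20.5, 17.5, 14.5, 11.5, 8.5, 5.5]
OBSERVED = [3, 14, 38, 129, 191, 233, 215, 128, 52, 10, 3]
MEAN, SD = 18.86, 4.96


def expected_frequencies(lower_limits, mean, sd, n):
    """Expected step frequencies under a normal curve; the end steps take in the tails."""
    dist = NormalDist(mean, sd)
    cumulative = [1.0] + [dist.cdf(x) for x in lower_limits] + [0.0]
    return [n * (upper - lower) for upper, lower in zip(cumulative, cumulative[1:])]


def pool_ends(observed, expected, minimum=5.0):
    """Combine end categories inward until each end fe is at least `minimum`."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[0] < minimum:
        first_o, first_e = obs.pop(0), exp.pop(0)
        obs[0] += first_o
        exp[0] += first_e
    while len(exp) > 1 and exp[-1] < minimum:
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    return obs, exp


n = sum(OBSERVED)
expected = expected_frequencies(LOWER_LIMITS, MEAN, SD, n)
obs_pooled, exp_pooled = pool_ends(OBSERVED, expected)

# Formula 3.1 and its algebraic variant, Formula 3.3.
chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
chi2_alt = sum(o * o / e for o, e in zip(obs_pooled, exp_pooled)) - n

# Three restrictions were imposed: N, the mean, and the standard deviation.
df = len(exp_pooled) - 3

# Chi-square table values for 6 df (assumed here): 5.348 at P = .50, 7.231 at P = .30.
between_30_and_50 = 5.348 < chi2 < 7.231

print(f"categories = {len(exp_pooled)}, df = {df}")
print(f"chi square = {chi2:.2f} (Formula 3.1), {chi2_alt:.2f} (Formula 3.3)")
print(f"P between .30 and .50: {between_30_and_50}")
```

As in the example, the result falls between the table values for P = .50 and P = .30, so the hypothesis of normality remains tenable.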


NORMALIZING A DISTRIBUTION 
When an empirical distribution is normalized, a conversion system is
developed such that the distribution of converted scores is as close as
possible to a normal distribution. Since no linear conversion will change
the shape of a distribution, the usual procedure is to establish partitioning
values in the form of percentiles, such that the frequencies in the categories
so formed will approximate a normal curve.

As an example, consider the problem of assigning five letter grades,
A, B, C, D, and E, so that a distribution approximately normal results.
One possibility would be to have one standard deviation as the width of
each of the five intervals, since from Table A (Appendix) it is seen that
the limits ±2.5σ include 98.76 percent of a normal distribution. The few
cases lying outside these limits can be placed in the A and E categories
without much distortion from approximate normality.

From Table A it is found that the mean and the limit of .5σ enclose
.1915 of the area of a normal curve, while the mean and 1.5σ enclose
.4332 of the area. From these figures the desired proportion of the total
distribution for each letter grade category is readily found, as shown in the
accompanying table.

CATEGORY    LIMITS              PROPORTION    HOW OBTAINED
   A        +1.5σ and above       .0668       .5000 − .4332
   B        +.5σ to +1.5σ         .2417       .4332 − .1915
   C        −.5σ to +.5σ          .3830       .1915 + .1915
   D        −1.5σ to −.5σ         .2417       .4332 − .1915
   E        Below −1.5σ           .0668       .5000 − .4332

By cumulating the proportions upward and changing decimal points, the
following percentiles (rounded to whole numbers) are found as partitioning
values:

                        PARTITIONING
                         PERCENTILE
Between A and B  . . . . .  P₉₃
Between B and C  . . . . .  P₆₉
Between C and D  . . . . .  P₃₁
Between D and E  . . . . .  P₇

When a continuous or nearly continuous variable is divided into five
groups by means of these partitioning percentiles, the resultant five-
category variable is reasonably close to normal.
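
The letter-grade proportions and partitioning percentiles above can be reproduced directly from the normal curve. The sketch below assumes `statistics.NormalDist` in place of Table A; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Boundaries in sigma units between E|D, D|C, C|B, and B|A.
cut_points = [-1.5, -0.5, 0.5, 1.5]
grades = ["E", "D", "C", "B", "A"]

# Cumulative proportion of the normal curve below each boundary.
cumulative = [std_normal.cdf(z) for z in cut_points]

# Proportion in each grade: difference between successive cumulative proportions.
edges = [0.0] + cumulative + [1.0]
proportions = {g: hi - lo for g, lo, hi in zip(grades, edges, edges[1:])}

# Partitioning percentiles, rounded to whole numbers.
percentiles = [round(100 * c) for c in cumulative]

for grade in reversed(grades):
    print(f"{grade}: {proportions[grade]:.4f}")
print("Partitioning percentiles (E|D, D|C, C|B, B|A):", percentiles)
```

The same pattern extends to any number of categories: only the list of cut points changes.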


THE C SCALE AND THE STANINE SCALE 


Example 11.4 demonstrates a procedure to divide an obtained distribution 
into 11 categories yielding a close approximation to the normal curve, the 
so-called C scale. Here the original distribution appears to be somewhat 
skewed negatively, whereas the converted distribution seems definitely
closer to normal. However, when N is low, when values are discontinuous 
and limited in range, and when the frequencies are fitted into relatively 
few categories, the resultant distribution can be expected to be only 
approximately normal. 

A variant of the C scale is the stanine scale (“standard nine”) originally
used by Army Air Force psychologists in World War II. In Example 11.5
the C distribution is converted into a stanine distribution by consolidating
the frequencies in steps 9 and 10 as 9's and the frequencies in steps 0 and 1
as 1's.

Stanines deviate from the normal a little more than do C scores, since the 
tails of the stanine distribution are blunt. As single-digit scores, however, 
they are easy to handle, both in routine reporting and in statistical analyses. 
Because a single digit is often sufficient to indicate individual differences, 
stanines are often used as test norms. 


EXAMPLE 11.4
NORMALIZING AN OBTAINED DISTRIBUTION 


To convert an observed distribution into something approximating a normal 
distribution, it is first necessary to choose a basis for the normalization. This 
example is chiefly concerned with the C scale—a distribution of 11 steps extend- 
ing from 0 to 10, with a mean of 5. With the exception of the step at either end 
of the distribution, each step has a width of .5σ.

Table 11.5 shows the development of the partitioning percentiles, which are
general and applicable to any distribution for which C-score equivalents are
desired. Only the pN in the final column are specific to the distribution normalized
in Table 11.6.

TABLE 11.5. DEVELOPMENT OF A C SCALE (N = 256)

  (1)      (2)      (3)         (4)           (5)       (6)        (7)        (8)
          AREA                                         CUMU-     PARTI-
 LOWER    FROM               COMPUTATION    PROPOR-    LATIVE    TIONING
 LIMIT   x/σₓ TO     C       OF PROPOR-     TION IN    PROPOR-   PERCENTILE
 x/σₓ     MEAN     SCORE    TION IN STEP     STEP      TION IN   (LOWER
                                                        STEP      LIMIT)       pN
  2.25    .4878     10     .5000 − .4878    .0122     1.0000     P98.78     252.88
  1.75    .4599      9     .4878 − .4599    .0279      .9878     P95.99     245.73
  1.25    .3944      8     .4599 − .3944    .0655      .9599     P89.44     228.97
   .75    .2734      7     .3944 − .2734    .1210      .8944     P77.34     197.99
   .25    .0987      6     .2734 − .0987    .1747      .7734     P59.87     153.27
  −.25    .0987      5     .0987 + .0987    .1974      .5987     P40.13     102.73
  −.75    .2734      4     .2734 − .0987    .1747      .4013     P22.66      58.01
 −1.25    .3944      3     .3944 − .2734    .1210      .2266     P10.56      27.03
 −1.75    .4599      2     .4599 − .3944    .0655      .1056     P4.01       10.27
 −2.25    .4878      1     .4878 − .4599    .0279      .0401     P1.22        3.12
                     0     .5000 − .4878    .0122      .0122

TABLE 11.6. COMPUTATION OF PARTITIONING PERCENTILES AND REDISTRIBUTION
(Scores of 256 applicants on a police aptitude test)

                                            NORMALIZED SCORES


OBTAINED C SCORE STANINE 
DISTRIBUTION PARTITIONING DISTRIBUTION DISTRIBUTION 
mes / PERCENTILES STEPS CGscore f  STANINE f 
i ge 2 75 ог төге 10 2 4 fa 
-75 8 Pos.18 = 74.9 72-4 9 
68-71 11 Ра аА 67-71 s 2 8 20 
64-67 25 Раула = 66.5 62-66 7 ж 7 28 
60-63 30 Ртза= 61.9 56-61 сше 9 
56-59 27 Реш-555 350-55 5 47 5 
or on we 1: $i м 
48-31 34  —Pwis—494 35-43 3 
7 3 40.18 rend s 20 2 т 
0-43 15 Ралв-432 21-26 1 5 
= (OR 20 or below 0 4 
-35 18 —344 
28-31 3 P10.56 
427 3 Ріш-265 
20-23 3.01 = 26. 
16-19 $ P1.22 = 20.3 
12-15 1 


Examples of Computation of Percentiles

(pN = 3.12):    P1.22 = 19.5 + (…) × 4 = 20.3

(pN = 10.27):   P4.01 = 23.5 + (…) × 4 = 26.5

Column 1 of Table 11.5 gives the partitioning z scores (x/σₓ) at the lower
limit of each step. Corresponding areas of the normal curve to the mean (from 
Table A, Appendix) are shown in column 2. C scores are given in column 3. 
The proportion of the distribution to be allotted to each step is indicated in 
column 5, and the computations to obtain the required proportions are shown 
in column 4. Except for the C score of 5, where the areas on either side of the 
mean are added, the proportion in each step involves subtraction of one area 
from another. In the case of the two end steps, .5000 represents the area in one- 
half the curve. 

In column 6 the proportion of cases in each step has been cumulated. These 
figures, each multiplied by 100, give the required partitioning percentiles at the 
top of the step. In column 7 are shown the partitioning percentiles at the lower 
limit of the step. The pN of column 8 are found by multiplying the cumulative 
proportion at the bottom of the step by N, which is 256. These figures are useful 
in finding the ten needed percentiles by the method described in Chapter 4. 
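
The columns of Table 11.5 follow mechanically from the step width of .5σ, so they can be generated for any N. The sketch below assumes `statistics.NormalDist` in place of Table A; the function names `c_scale` and `to_stanine` and the dictionary keys are illustrative. Because areas are not rounded to four places, the pN values may differ from the table in the second decimal.

```python
from statistics import NormalDist


def c_scale(n_cases: int, top_score: int = 10, step_width: float = 0.5):
    """Build the rows of a C scale, top step first, for a distribution of n_cases."""
    std_normal = NormalDist()
    rows = []
    upper_cum = 1.0  # cumulative proportion at the top of the current step
    for score in range(top_score, -1, -1):
        if score > 0:
            # Lower limits in sigma units run 2.25, 1.75, ..., -2.25 for scores 10 to 1.
            lower_z = (score - top_score / 2 - 0.5) * step_width
            lower_cum = std_normal.cdf(lower_z)
        else:
            lower_z, lower_cum = None, 0.0  # the bottom step is open below
        rows.append({
            "c": score,
            "lower_z": lower_z,
            "proportion": upper_cum - lower_cum,
            "cum_top": upper_cum,
            "percentile_lower": 100 * lower_cum,
            "pN": lower_cum * n_cases,
        })
        upper_cum = lower_cum
    return rows


def to_stanine(c_score: int) -> int:
    """Consolidate C scores 0 and 1 as stanine 1, and 9 and 10 as stanine 9."""
    return min(max(c_score, 1), 9)


rows = c_scale(256)
for row in rows:
    print(f"C = {row['c']:2d}  proportion = {row['proportion']:.4f}  "
          f"cumulative = {row['cum_top']:.4f}  pN = {row['pN']:7.2f}")
```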

In Table 11.6 two examples of finding numerical values for partitioning 
percentiles are given. In each case the required percentile is the lower limit of 
the step containing the percentile plus a fraction of the step interval, which in 
this case is 4. The fraction is pN less the number of scores below the step, divided 
by the step frequency. 

The frequencies in the C score distribution cannot be found from the distribu- 
tion at the left of Table 11.6 because the frequencies there are grouped in 
categories. They were obtained by distributing the original scores in the C steps, 
which vary considerably in interval. As examples, scores of 72, 73, and 74 are 
converted to 9; scores of 35 through 43, inclusive, are converted to 3. Thus it 
is seen that the transformation is not accomplished by a linear equation, but 
rather by sorting the original scores into categories that yield a modified distri- 
bution. 

The C-score distribution is not perfectly normal, since it lacks complete sym- 
metry and tends to be flat-topped. Deviations from normality can, however, 
be explained as resulting from chance. 

The stanine distribution is obtained from the C-score distribution by com- 
bining the two categories at either end of the C scale. 


THE CENTRAL LIMIT THEOREM 


An important principle for inferring characteristics of the population from 
properties of samples is the central limit theorem. As sample size increases, 
many statistics, including the mean, the standard deviation, and the 
variance, have a distribution that becomes more and more normal, and 
this may be true even though the underlying variable is not normally 
distributed in the population. A requirement for the theorem is that the 
underlying variable have finite mean and variance, but this is generally true 
of variables of interest in psychology. 

The distribution of a statistic is, of course, the tabulation that would
be expected if the statistic were repeatedly computed on a long series of
random samples of identical size, each representing the same population.
When such a distribution is normal, it is adequately described by its
standard deviation, and knowledge of such standard deviation (that is, the
“standard error” of the statistic) facilitates deductions about the
population, as will be discussed further in Chapter 13.

Here, in connection with chance error, are the most important impli- 
cations of the normal curve. The fact that the error in the estimation of a 
parameter may be distributed normally and with known variance facilitates 
the use of statistical methods as tools in the development of dependable 


knowledge. 
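
The tendency described by the central limit theorem can be illustrated by simulation. The following is a rough sketch under stated assumptions: the population is taken to be exponential (markedly skewed, with finite mean and variance), 2000 samples are drawn at each sample size, and the seed is fixed only so that the run is repeatable. None of these choices comes from the text.

```python
import random
from statistics import mean, pstdev


def g1(values):
    """Mean of the cubed z scores (Formula 11.8)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def sample_means(sample_size, n_samples, rng):
    """Means of n_samples random samples, each of sample_size exponential values."""
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]


rng = random.Random(2024)  # fixed seed so that the sketch is repeatable

# Skewness of the distribution of sample means at several sample sizes.
skewness_by_size = {
    size: g1(sample_means(size, 2000, rng)) for size in (1, 5, 25, 100)
}

for size, skew in skewness_by_size.items():
    print(f"N = {size:3d}: g1 of the sample means = {skew:+.2f}")
```

With samples of one case the distribution of "means" is simply the skewed population itself; as the sample size grows, the skewness of the sample means shrinks toward the value of zero expected for a normal distribution.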


SUMMARY 


Of the theoretical distributions developed from probability theory and of
interest in statistics, the so-called normal distribution has the widest range
of applications. This distribution curve is symmetrical about its mean,
with cases becoming rarer as departure from the mean becomes greater.
It has points of inflection one S.D. above and below the mean and varies
in either direction without limit.

Many empirical variables appear to be distributed normally. Procedures
exist to determine whether an observed distribution differs from the
normal in skewness and kurtosis. By means of the chi-square test of
goodness of fit, it is possible to decide whether an obtained distribution
differs significantly from normality or whether the hypothesis of a normal
distribution remains tenable. When a variable can be regarded as normally
distributed, the use of a table of the normal distribution makes various
deductions possible.

The fact that many statistics are normally distributed is of great
importance in inferring characteristics of the population from observations of
specific samples.


EXERCISES 


1. Using Formula 11.5a and a table of logarithms to the base e, find y, the
ordinate of the normal curve at z = .50, z = 1.50, and z = 2.50. Compare
results with values found in Table O (Appendix).


2. In an experiment on generalization, the subjects’ task is to arrange four objects
in a certain order. What is the probability of a subject’s doing the task
correctly completely by chance?


3. By interpolation in Table O, determine the height of the ordinate of the normal
curve at the following percentiles: P1, P5, P10, P20, P30, P40, P50, P60, P70,
P80, P90, P95, and P99. Plot the resultant y values on coordinates with
appropriate z values on the base line. Does the curve appear to be normal?


4. For the distribution in Example 5.4 compute g_1, Fisher’s measure of skewness,
and determine how many standard errors g_1 is from the value of .00 expected
for a sample representing a normal population.


5. Find the partitioning percentiles by which the distribution of Example 5.3 
might be converted to stanines. 


6. Apply the chi square test of goodness of fit to the distribution of Example 
5.4 to determine whether the hypothesis of normal distribution is tenable. 


7. Find the partitioning percentiles needed to convert a continuous variable to
a seven-category “normal” scale, with values ranging from 1 to 7, centered
at 4, and with a band width of .8σ.


8. A professional school uses as a selection device an aptitude test with M = 500
and S.D. = 100. It has a quota of 175 entering students and, in the interest
of good public relations, wishes to accept any fully qualified student as soon
as his aptitude score becomes available.
Make the following assumptions:

1. The number of applicants who meet other qualifications and who are
permitted to take the aptitude test will be 800;

2. Any student accepted will enter;

3. The mean and standard deviation will continue to be 500 and 100,
respectively; and

4. The aptitude test is normally distributed.

What cutting score can be expected to yield approximately 175 entering
students?


REFERENCE


FISHER, R. A., Statistical Methods for Research Workers. Edinburgh and
London: Oliver and Boyd, 1936, p. 339. 


12

FAMILIES OF CHANCE DISTRIBUTIONS


Of the theoretical frequency curves applicable to statistical problems, the
normal distribution has the widest range of applications. Not only do
many observed variables appear to be normally distributed, but also, as
has already been mentioned, the normal curve is frequently useful in
estimating how much variation would be anticipated if a given statistic
were to be computed from a number of different samples. Often it appears
as the limiting case in a family of closely related theoretical frequency
distributions.

The number of theoretical distributions that have been or could be 
developed on the basis of probability theory is exceedingly large. It is 
beyond the scope of this text to treat any of them with mathematical rigor. 
Nevertheless, workers in the fields of psychology, education, and the 
social sciences frequently need to appreciate certain families of curves and
their applications.

The rarity of a particular observation or of a statistic based upon a
number of operations can be estimated by finding the place of the empirical
information in a model distribution of all the values of the observation or
of the statistic possible under a stated hypothesis. If the observed finding
appears to be highly unlikely under the model, the hypothesis may require
revision or replacement. On the other hand, if observations appear to be
in accordance with the hypothesis, it may be allowed to stand, pending
further evidence. This chapter is concerned with the description of certain
distributions rather than with their use to substantiate or, to some degree,
to refute a stated generalization. The matter of testing hypotheses will be
considered in Chapter 13.

In a real sense, there is only one normal distribution. Its shape is 
constant despite variation in the number of cases, in the mean, and in the 
standard deviation. It is always symmetrical about the mean. It extends in 
either direction indefinitely, and the points of inflection are exactly one 
standard deviation above and below the mean. 

In contrast, each of the mathematical models described in this chapter
yields a family of distribution curves, members of which may differ widely
in contour, but which nevertheless are based on a single underlying
mathematical function derived from probability theory. Because of this
underlying function, it is mathematically quite appropriate to speak of the
binomial, or the Poisson, or the χ², or the t, or the F distribution. However,
one must be aware that each of these distributions takes different forms
under different conditions. In the case of the binomial, the shape varies
with p (the probability of an event) and n (the number of independent
events); and in other cases the shape of the distribution varies with ν, the
number of degrees of freedom,¹ which is akin to the number of independent
events. Since the normal curve can be developed as a special case of the
binomial when p = q and when the number of independent events is
indefinitely great, it may be no surprise that there are other curves that
approach normality when the number of degrees of freedom becomes
large.

While any distribution function could be utilized in the form of an 
equation or a graph, the most convenient medium is generally a table. In 
the case of the normal curve, it will be remembered that entry into a table 
is usually in terms of the distance in standard deviation units above or 
below the mean, and the table yields information as to an area or an 
ordinate of the curve. In the case of a table representing a family of
curves, a convenient format provides for entry by means of the degrees of
freedom and the value of the statistic. The table then yields information 
as to the rarity of the observation under defined conditions. 


THE BINOMIAL DISTRIBUTION 


In connection with the discussion of probability, certain aspects of the
binomial expansion were noted in Chapter 11, together with the use of the
binomial (when p = q) in the development of the normal distribution.
Here the binomial will be treated somewhat more generally and as the
basis for a family of distribution curves. The binomial is a useful model
when events are discrete and p (the probability of a single event) can be
considered to be a constant for all events.

¹ The symbol ν is the lower-case Greek letter “nu,” corresponding to n. In
this text, ν always refers to the number of degrees of freedom.

The (n + 1) terms of the expansion (p + q)^n are the proportions of the
(n + 1) possible values of a variable, X, which ranges in value from n
down to 0, each value being the number of coinciding events. Thus the
distribution is discontinuous in that it consists of discrete values.

The proportion of cases within a value of X is given by the formula for
any term of the expansion of (p + q)^n:

$$p_X = \binom{n}{X} p^X q^{n-X} = \frac{n!}{X!\,(n-X)!}\, p^X q^{n-X} \qquad [X = n, (n-1), \ldots, 0] \tag{12.1}$$

This formula may be readily applied to a simple problem. If an albino
rat has one chance in five of making a correct discrimination entirely by
chance (p = .20), what is the probability of exactly four correct
discriminations in seven trials? Formula 12.1 becomes

$$p_4 = \binom{7}{4} (.2)^4 (.8)^3 = \frac{7 \times 6 \times 5 \times 4\,(3 \times 2 \times 1)}{(4 \times 3 \times 2 \times 1)(3 \times 2 \times 1)}\,(.0016)(.512) = .028672$$


It should be noted that .029 is the probability of exactly four correct
discriminations, not the probability of four or more, which would
ordinarily be required for testing a hypothesis about the rat’s behavior. The
probability of a score of four or more would be the sum of the probabilities
of scores of 4, 5, 6, and 7; that is,

$$\sum_{X=4}^{X=7} p_X$$
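The computation for the rat example can be sketched in a few lines. The function name `binomial_term` is hypothetical and simply evaluates Formula 12.1; the values n = 7 and p = .20 are those of the example.

```python
from math import comb


def binomial_term(n, x, p):
    """Formula 12.1: probability of exactly x coinciding events in n trials."""
    q = 1.0 - p
    return comb(n, x) * p ** x * q ** (n - x)


n, p = 7, 0.20
p_exactly_4 = binomial_term(n, 4, p)
# Probability of four or more: sum of the terms for X = 4, 5, 6, and 7.
p_4_or_more = sum(binomial_term(n, x, p) for x in range(4, n + 1))
# All (n + 1) terms together should account for every case.
total = sum(binomial_term(n, x, p) for x in range(n + 1))
print(f"{p_exactly_4:.6f} {p_4_or_more:.6f} {total:.2f}")
```

The tail probability is somewhat larger than the single term, which is why the distinction between "exactly four" and "four or more" matters when a hypothesis is tested.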
The sum of probabilities of all the values of X from n down to and
including 0 is, of course, 1.00. The mean of the values of X is

$$\mu_x = np \tag{12.2}$$

and the standard deviation is

$$\sigma_x = \sqrt{npq} \tag{12.3}$$


The variable X can be converted to z form, with mean of 0 and variance
of 1, by the usual procedure:

$$z = \frac{X - \mu_x}{\sigma_x} = \frac{X - np}{\sqrt{npq}} \tag{12.4}$$
Skewness can be tested by a variant of Formula 11.8:

$$g_1 = \frac{q - p}{\sqrt{npq}} \tag{12.5}$$

If q > p, the distribution is positively skewed, while if p > q, the skew is
negative. The value of g_1 becomes precisely 0 only in the special case when
p = q; however, when n is large and p and q are not too greatly different, it
can be seen that g_1 becomes very small.
Kurtosis can be tested by a variant of Formula 11.10:

$$g_2 = \frac{1 - 6pq}{npq} \tag{12.6}$$

When n is large and neither p nor q is very small, g_2 is close to zero, and
hence the distribution meets another criterion of normality.

Actually, p and q can differ considerably and still yield a distribution 
that is essentially normal. If p < .5 and np > 5, or if q < .5 and nq > 5,
the binomial may be considered normal. Thus the curve produced by the 
binomial under very different conditions tends to approach normality. 
However, many of the uses of the binomial in psychological experiments 
involve cases where the chance distribution is markedly skewed. 

Binomial distributions for n of 6, 12, and 24 are shown in Figs. 12.1,
12.1A, and 12.1B. On each graph, two distributions are shown: one for
p = .25 and q = .75; the other, for p = q = .50. The latter is always
symmetrical, whereas the former is markedly skewed for n = 6, less so for
n = 12, and almost symmetrical for n = 24. Since p = .25 (which is less
than .50) and np = 6 (which is greater than 5), this last distribution can be
considered essentially normal.
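How the shape of the binomial changes with n can be checked numerically. The sketch below evaluates Formulas 12.2, 12.3, 12.5, and 12.6 and compares them with moments computed directly from the terms of the expansion. It assumes the usual moment definitions (skewness as the third central moment over the 3/2 power of the variance, kurtosis as the fourth central moment over the squared variance, less 3), which are taken here to correspond to Formulas 11.8 and 11.10; the function names are hypothetical.

```python
from math import comb, sqrt


def binomial_shape(n, p):
    """Mean, S.D., skewness, and kurtosis by Formulas 12.2, 12.3, 12.5, 12.6."""
    q = 1.0 - p
    npq = n * p * q
    return n * p, sqrt(npq), (q - p) / sqrt(npq), (1.0 - 6.0 * p * q) / npq


def moments_from_terms(n, p):
    """The same four quantities computed directly from the terms of (p + q)^n."""
    q = 1.0 - p
    probs = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(probs))
    # Second, third, and fourth moments about the mean.
    m2, m3, m4 = (
        sum((x - mean) ** k * px for x, px in enumerate(probs)) for k in (2, 3, 4)
    )
    return mean, sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


for n in (6, 12, 24):
    mean, sd, g1, g2 = binomial_shape(n, 0.25)
    print(n, round(mean, 2), round(sd, 3), round(g1, 3), round(g2, 3))
```

With p = .25, the skewness shrinks as n goes from 6 to 12 to 24, in keeping with the progressively more symmetrical curves described for the figures.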


APPLICATION OF THE BINOMIAL 


An example of the use of the binomial follows: On an achievement test 
of 100 five-choice items, how high a score would a student have to attain 
before it could be said that his performance was not “just chance?” 

In the use of the binomial in this problem, it is assumed that with a
five-choice format, a p of .2 is constant for each item, and that the items are
independent in that success on one item has no effect on success on any
other.

Actually, any total score from 100 down to 0 could happen “by chance”
(that is, without the student having any knowledge of the subject matter),
but high scores would be exceedingly improbable. By Formula 12.1 the
probability of answering all 100 questions correctly by chance is (.2)¹⁰⁰, a
proportion that is almost infinitesimally small, but which is still not
precisely zero. (The probability of a score of zero is mathematically
greater, (.8)¹⁰⁰, but still is practically zero.) In this problem, however,
only the high end of the distribution is of concern. A score below what
would be expected by chance has little, if any, psychological meaning.


[FIG. 12.1. BINOMIAL DISTRIBUTIONS FOR n = 6. Vertical axis: proportion of
cases; horizontal axis: values of X.]


It is apparent that since any score can occur by chance, some standard
must be established for regarding an event as something not happening at
hazard. Conventionally, two levels of probability-improbability have been
most frequently used, depending on the choice of the investigator: the .01,
or 1 percent level; and the .05, or 5 percent level. In this example we
arbitrarily choose to regard a score as deviating significantly above chance
performance if by hazard it could happen only once in 100 times; that is,
it attains the .01 or 1 percent level of significance.

[FIG. 12.1A. BINOMIAL DISTRIBUTIONS FOR n = 12. Horizontal axis: values of X.]

[FIG. 12.1B. BINOMIAL DISTRIBUTIONS FOR n = 24. Horizontal axis: values of X.]

The defined distribution of chance scores, as generated by the binomial,
can be taken as normal. With p less than .5, np is 20, which is considerably
greater than the 5 that experience has shown as the figure above which the
fit of the normal distribution to the binomial is good.
From Table A (Appendix), it is deduced that 1 percent of the scores of a
normal distribution lie more than 2.323σ above the mean. On the hypothesis
of chance performance, μ_x, by Formula 12.2, is 20, and σ_x, by Formula
12.3, is √(100 × .2 × .8), or 4. The value of 2.323σ is therefore 9.3, and the
point above which 1 percent of the scores would be expected to fall by
chance is (20 + 9.3), or 29.3. Accordingly, a score of 30 would be attained
by chance less than once in 100 times.
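The cutting point for the 100-item test can be sketched as follows. The normal deviate is obtained from Python's `statistics.NormalDist` rather than from Table A, so it may differ in the third decimal from the tabled figure; the helper `upper_tail` is a hypothetical name for the exact binomial sum, included only so that the normal approximation can be set beside it.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.2
mean = n * p                        # Formula 12.2
sd = sqrt(n * p * (1 - p))          # Formula 12.3
z_01 = NormalDist().inv_cdf(0.99)   # deviate cutting off the upper 1 percent
cutoff = mean + z_01 * sd


def upper_tail(n, p, score):
    """Exact binomial probability of `score` or more correct answers."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(score, n + 1))


print(round(cutoff, 1), upper_tail(n, p, 30))
```

Because the binomial is discrete and, with p = .2, still slightly skewed, the exact tail for a score of 30 need not agree precisely with the figure implied by the normal curve; printing both allows the closeness of the approximation to be judged directly.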
Of course the mere improbability of a score on the chance hypothesis
does not demonstrate that the performance is the result of knowledge of
the subject matter. The explanation of the performance must come from
sources other than statistical analysis. Also, the description of the
probability-improbability of a score does not in itself indicate anything of the
degree of any characteristic that the variable may reflect. How much
knowledge is represented by a score of 30, or any other score, must be
studied by test development methods.


THE POISSON DISTRIBUTION 

A chance distribution for which values are relatively easy to find and which
is often useful in psychological research is the Poisson, a function described
by Poisson in 1837 (7). Although not difficult, its derivation is beyond the
scope of this presentation. Like the binomial, the distribution is
discontinuous, comprising discrete values. Frequencies of the
variable X, from j down to 0, may be given as

$$\begin{array}{cc}
X & f \\
j & N e^{-a}\, \dfrac{a^{j}}{j!} \\
\vdots & \vdots \\
3 & N e^{-a}\, \dfrac{a^{3}}{3!} \\
2 & N e^{-a}\, \dfrac{a^{2}}{2!} \\
1 & N e^{-a}\, \dfrac{a}{1!} \\
0 & N e^{-a}
\end{array}$$

In the preceding values, N is the total number of cases and e is 2.71828...,
the base of the natural or Napierian system of logarithms, a constant of the
curve. An important characteristic of the Poisson is that the mean and the
variance are equal, and in the expressions for the frequencies given above,
either the mean or the variance may be taken as a.

Indicating the mean as μ, the probability of any value of X is

$$p_X = \frac{e^{-\mu} \mu^{X}}{X!} \qquad (X = 0, 1, 2, \ldots) \tag{12.7}$$


To obtain the frequency, the probability is multiplied by N, as in the
algebraic distribution given above. The sum of the probabilities is, as
usual, 1.000.

From Formula 12.7 it can be noted that the successive terms of this
exponential series, beginning with X = 0, are, as probabilities:

$$p = e^{-\mu},\quad \mu e^{-\mu},\quad \frac{\mu^{2}}{2!}\, e^{-\mu},\quad \frac{\mu^{3}}{3!}\, e^{-\mu},\ \ldots \tag{12.8}$$

Any term corresponding to X, the value of the variable, can be obtained
from its predecessor in the series by multiplying by μ/X. When the mean
is known, any number of terms may be readily found. While there is no 
mathematical limit to the number of terms in the series, characteristically 
they soon become so small as to be negligible. 
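The step-by-step rule for generating the series lends itself to a short sketch. The function names are hypothetical, and the mean of 2.0 is an arbitrary illustrative value; each term is built from its predecessor and then checked against Formula 12.7 evaluated directly.

```python
from math import exp, factorial


def poisson_terms(mean, count):
    """First `count` Poisson probabilities, each obtained from its predecessor."""
    terms = [exp(-mean)]                    # term for X = 0
    for x in range(1, count):
        terms.append(terms[-1] * mean / x)  # multiply the preceding term by mean/X
    return terms


def poisson_direct(mean, x):
    """Formula 12.7 evaluated directly for a single value of X."""
    return exp(-mean) * mean ** x / factorial(x)


terms = poisson_terms(2.0, 8)
print([round(t, 4) for t in terms])
print(round(sum(poisson_terms(2.0, 60)), 6))
```

The second line printed shows that the terms, although without mathematical limit in number, sum to unity for all practical purposes well before sixty of them have been taken.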

Formula 12.2 for the mean of a binomial series, μ = np, applies to the
Poisson. Actually, when n is very large and p approaches zero, the Poisson
yields good approximations to binomial frequencies. The p_X obtained by
Formula 12.7 can be used to estimate the p_X of Formula 12.1. The advantage
is in ease of computation.
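The closeness of the Poisson to the binomial when n is large and p small can be illustrated with hypothetical figures. In the sketch below, n = 1000 and p = .002 are assumed values chosen only for illustration, and the function names are not from the text.

```python
from math import comb, exp, factorial


def binomial_term(n, x, p):
    """Formula 12.1 for a single value of X."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_term(mean, x):
    """Formula 12.7 for a single value of X."""
    return exp(-mean) * mean ** x / factorial(x)


n, p = 1000, 0.002          # hypothetical: many exposures, small chance on each
mean = n * p                # Formula 12.2
largest_gap = max(
    abs(binomial_term(n, x, p) - poisson_term(mean, x)) for x in range(11)
)
print(largest_gap)
```

The largest discrepancy between corresponding terms is very small under these conditions; repeating the comparison with a smaller n and a larger p (for example, n = 10 and p = .2, which give the same mean) shows the approximation deteriorating.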

Among other applications, the Poisson has been used in the study of 
accidents. If accidents are connected with individuals purely by chance, and 
if there are fewer accidents than people, then the distribution of the 
frequencies of accidents by individuals can be expected to follow the 
Poisson. In this case the Poisson can be regarded as a convenient
substitute for the binomial, (p + q)^n, with n, the number of independent events,
considered to be large, and with p, the probability of an accident during
any single event (or “exposure”), as very small.

The goodness of fit of an empirical distribution to the Poisson can be
tested with χ², exactly the way the fit of a set of observed frequencies to
the normal distribution is tested. The one difference is in the number of
degrees of freedom. As noted in connection with Example 11.3, df in
fitting a normal curve is the number of categories less three, one df being
lost for N, another for M, and a third for s. Since for a Poisson μ = σ², only
2 df are lost, one for N and another for M, which is taken to represent μ.
The usual restriction with χ² holds, namely, that no f_e be less than 5.
Any category with an expected frequency less than 5 must be combined
with one or more other categories to build up the f_e to 5 or more. Example
12.1 illustrates the process of fitting a Poisson to observed data and
testing the goodness of fit by means of χ².


EXAMPLE 12.1 


TESTING THE FIT OF A POISSON TO AN OBSERVED DISTRIBUTION 


From N, ΣX, and ΣX² of the observed distribution, the mean (.370 in this case)
and the variance (.369) are found by regular procedures. The fact that the mean
and the variance are practically identical leads one to believe that the Poisson
may be taken as the underlying distribution.

TABLE 12.1. TESTING THE FIT OF A POISSON TO AN OBSERVED DISTRIBUTION
Data from O'Gorman and Kunkle (6)

NUMBER OF                          Np_e
ACCIDENTS     f_o      p_e       or f_e     (f_o − f_e)    (f_o − f_e)²/f_e

    3           6     .006          6            0              .000
    2          47     .047         45            2              .089
    1         242     .256        245           −3              .037
    0         662     .691        661            1              .002

                   Σp_e = 1.000

Computations

N = 957
ΣX = 354
ΣX² = 484
M_x = .370
s_x² = .369
χ² = Σ(f_o − f_e)²/f_e = .128
df = 2
P > .50

From a table of e^(−M), such as Table E (Appendix), the value of e^(−.370) is found
to be .691, which, in accordance with Formula 12.7, is the proportion of cases
for which the expected value of X is 0. (The value of e^(−M) may also be found,
but not quite so conveniently, from any table of natural logarithms. It is the
reciprocal of the antilogarithm of M.)

Subsequent proportions are found by multiplying the proportion
corresponding to the next lower value of the variable by μ/X, as indicated in
Formula 12.8. Thus, p_1 is (.691)(.370), or .256; p_2 is (.256)(.370)/2, or .047;
and p_3 is (.047)(.370)/3, or .006. The value of p_4 is (.006)(.370)/4, or less
than .001, and subsequent terms in the series are clearly negligible.

Chi square can be computed either as Σ(f_o²/f_e) − N or as Σ(f_o − f_e)²/f_e. With
a χ² of .13 and two degrees of freedom, it is found from Table C (Appendix)
that P is greater than .50. Hence the difference between the observed
distribution and the Poisson can be regarded as a highly likely chance occurrence,
and the Poisson can be considered as an excellent fit to the data.

Of course the fact that the data fit the Poisson does not prove that in this
sample, the accidents were actually distributed among people by chance. Rather
it shows that the hypothesis of chance distribution is not refuted by the facts.
In those instances in which χ² is significant and the Poisson does not fit the
data, further investigation would be required to ascertain the cause of the
difference, which might be factors within individuals (“accident proneness”),
or differential exposure, or even a systematic difference within the criterion
itself.
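The arithmetic of Example 12.1 can be retraced in a short sketch. The observed frequencies and the rounded expected frequencies are those of Table 12.1; the variable names are hypothetical. The probability for 2 df is obtained from the closed form exp(−χ²/2), which holds for exactly two degrees of freedom and is used here in place of Table C.

```python
from math import exp

observed = {3: 6, 2: 47, 1: 242, 0: 662}   # f_o from Table 12.1
expected = {3: 6, 2: 45, 1: 245, 0: 661}   # rounded f_e from Table 12.1

n_cases = sum(observed.values())
sum_x = sum(x * f for x, f in observed.items())
sum_x2 = sum(x * x * f for x, f in observed.items())
mean = sum_x / n_cases
variance = sum_x2 / n_cases - mean ** 2

# Chi square by both computing forms given in the example.
chi_sq = sum((observed[x] - expected[x]) ** 2 / expected[x] for x in observed)
chi_sq_alt = sum(observed[x] ** 2 / expected[x] for x in observed) - n_cases

p_value = exp(-chi_sq / 2.0)   # upper-tail probability, valid for 2 df only
print(n_cases, sum_x, sum_x2, round(mean, 3), round(variance, 3))
print(round(chi_sq, 3), round(chi_sq_alt, 3), round(p_value, 2))
```

Summed without rounding the separate components, χ² comes to about .127, against .128 when the rounded entries of the last column are added; either way P is far above .50.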


DISTRIBUTIONS OF STATISTICS: χ², t, AND F

As already noted, the binomial and Poisson distributions are mathematical 
models that may be applied to observed frequency data. Not only do some 
sets of observed measurements take normal form, but also, as will be 
discussed further in Chapter 13, the expected distributions of certain 
statistics are normal. 

There are three important distributions that apply to statistics rather
than to observed measurements: chi square, applicable to squared
differences between frequencies; the t distribution, applying to differences
between statistics (particularly means); and F, a family of distributions of
ratios of variances.

Mathematically, these distributions are interrelated, both with one 
another and with other theoretical distributions, including the normal. 
Derivations and mathematical relationships are presented by numerous 
authors, including Fisher (2), Kendall and Stuart (4), Adams (1), and 
Lewis (5). 

Applications of χ² have already been presented in Chapter 3 and in
connection with testing the fit of the normal and Poisson distributions to
observed data. After discussion of certain general aspects in this chapter, t
will be encountered again in Chapter 13 and F in Chapter 14.


THEORETICAL AND EMPIRICAL DISTRIBUTIONS OF χ²


Applications of χ² involve the comparison of frequencies actually obtained
(observed frequencies, designated as f_o) with frequencies anticipated under
some hypothesis (theoretical or expected frequencies, designated as f_e).
As defined by Formula 3.1,

$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$

The following characteristics of χ² may be noted:


1. For each f_o there is a corresponding f_e.

2. χ² is the sum of computations involving pairs of f_o and f_e.

3. Since frequencies are always positive, and since differences between
f_o and f_e are squared, each contribution to χ² is zero or a positive
quantity, and the χ² for any set of data is zero or positive.

4. When there is no difference between each f_o and its corresponding f_e,
χ² is zero. It varies without upper limit.

5. χ² is a pure number rather than a number representing units of
measurement. The magnitude of a value of χ² obtained under defined conditions
can be interpreted only in terms of the probability of its occurrence.

6. As χ² increases from zero to infinity the probability P of its occurrence
(that is, of a value at least that large) decreases from 1.00 to 0.


Actually, chi square is distributed as the sum of the squares of ν
independent values, in z form, ν being the number of degrees of freedom. This
would appear reasonable in that each component of χ² has a format,
(f_o − f_e)²/f_e, which resembles the square of a z score with the format of
(X − M_x)²/s_x². Just as (X − M_x) is the distance above or below the mean
value (which might be considered the expected value and which is evaluated
in relation to s_x), so (f_o − f_e) is the difference between obtained and
expected frequencies, and the square of this difference is evaluated in
reference to f_e.

Theoretical tables of χ², such as Table C (Appendix), are based on a
continuous mathematical function, the formula for which is given in
advanced texts, including Lewis (5). The distribution of all possible values
of χ² that could be found from, say, a 2 x 2 table with limited N and with
fixed marginal frequencies is, of course, discrete, with many values not
appearing at all in the distribution. Nevertheless, it is useful to use a
theoretical, continuous distribution to evaluate the rarity of an obtained χ².

The general nature of the χ² family of curves may become clear through
the development of a set of empirical curves, using the basic χ² principle, as
illustrated in Example 12.2. The principle is that the variable defined as
the sum of the squares of n uncorrelated z scores is distributed as χ² with
n degrees of freedom.

For the required z-score values, samples of 100 lines each were selected
from a table of the random normal deviate. An adequate sample of the
squares of these values would be expected to approximate the χ²
distribution with 1 df.

Since pairs of these values should be uncorrelated, sums of pairs of
squares of these values should be distributed as χ² with 2 df. Sums of four
squares would be expected to follow the distribution of χ² with 4 df, and
so on. In Example 12.2 and Table 12.2 the curves derived from entries in a
table of the random normal deviate yield good approximations to the
corresponding χ² curves.
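The principle just stated can be tried with pseudo-random normal deviates in place of a printed table. The sketch below is an assumption-laden stand-in for the procedure described: it draws deviates with Python's `random.gauss` rather than from the Rand tables, and the function names and the seed are hypothetical.

```python
import random


def simulated_chi_squares(df, n_values, seed=12):
    """Sums of squares of `df` independent random normal deviates."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n_values)
    ]


def proportions_by_unit_interval(values, n_intervals=17):
    """Proportion of values in 0-.999, 1-1.999, ...; the last interval is open-ended."""
    counts = [0] * n_intervals
    for v in values:
        counts[min(int(v), n_intervals - 1)] += 1
    return [c / len(values) for c in counts]


for df in (1, 2, 4, 6):
    values = simulated_chi_squares(df, 100)
    print(df, proportions_by_unit_interval(values)[:4])
```

With only 100 values per curve, as in the example, the proportions fluctuate noticeably from one run to another; increasing `n_values` to several thousand brings them much closer to the theoretical proportions.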


EXAMPLE 12.2


COMPARISON OF THEORETICAL AND OBTAINED χ² DISTRIBUTIONS


Using the Rand tables (8), empirical distributions for one, two, four, and six
degrees of freedom were developed. In Tables 12.2 and 12.3 they are compared
with corresponding distributions obtained from mathematical functions. The
particular degrees of freedom were selected for ease in computation and as good
examples of χ² curves.


TABLE 12.2. COMPARISON OF THEORETICAL AND OBTAINED χ² DISTRIBUTIONS
FOR 1, 2, 4, AND 6 DEGREES OF FREEDOM

                 df = 1          df = 2          df = 4          df = 6
χ²              p_e    p_o      p_e    p_o      p_e    p_o      p_e    p_o

> 16            ...    ...      ...    ...     .001    ...     .004   .010
15-15.999       ...    ...      ...    ...     .002    ...     .007   .020
14-14.999       ...    ...      ...    ...     .003    ...     .009   .010
13-13.999       ...    ...     .001    ...     .004    ...     .013   .020
12-12.999       ...    ...     .001    ...     .006   .010     .019   .000
11-11.999       ...    ...     .002    ...     .009   .010     .026   .000
10-10.999       ...    ...     .003    ...     .014   .010     .036   .070
9-9.999        .001    ...     .004    ...     .021   .010     .049   .020
8-8.999        .002   .010     .007   .010     .030   .030     .065   .070
7-7.999        .003   .000     .011   .010     .044   .060     .083   .090
6-6.999        .006   .020     .020   .020     .063   .030     .102   .070
5-5.999        .011   .000     .032   .010     .088   .090     .121   .080
4-4.999        .020   .030     .053   .080     .119   .140     .133   .160
3-3.999        .038   .060     .088   .100     .152   .160     .132   .190
2-2.999        .074   .070     .145   .130     .178   .170     .111   .130
1-1.999        .160   .190     .239   .220     .174   .180     .066   .050
0-.999         .683   .620     .393   .420     .090   .100     .014   .010

TABLE 12.3. COMPARISON OF PROPORTIONS OF χ² EXPECTED TO EXCEED
GIVEN VALUES (P_e) AND PROPORTIONS ACTUALLY OBSERVED TO EXCEED
THOSE VALUES BY CHANCE (P_o) FOR 1, 2, 4, AND 6 DEGREES OF FREEDOMᵃ

           df = 1          df = 2          df = 4          df = 6
χ²        P_e    P_o      P_e    P_o      P_e    P_o      P_e    P_o

 1       .317   .380     .607   .580     .910   .900     .986   .990
 2       .157   .190     .366   .360     .736   .720     .920   .940
 3       .083   .120     .222   .230     .558   .550     .809   .810
 4       .046   .060     .135   .130     .406   .390     .677   .620

 5       .025   .030     .082   .050     .287   .250     .544   .460
 6       .014   .030     .050   .040     .199   .160     .423   .380
 7       .008   .010     .030   .020     .136   .130     .321   .310
 8       .005   .010     .018   .010     .092   .070     .238   .220

 9       .003    ...     .011    ...     .061   .040     .174   .150
10       .002    ...     .007    ...     .040   .030     .125   .130
11        ...    ...     .004    ...     .027   .020     .088   .060
12        ...    ...     .002    ...     .017   .010     .062   .060

13        ...    ...     .002    ...     .011    ...     .043   .060
14        ...    ...     .001    ...     .007    ...     .030   .040
15        ...    ...     .001    ...     .005    ...     .020   .030
16        ...    ...     .000    ...     .003    ...     .014   .010

ᵃ Data are the same as in Table 12.2.

To avoid bias, the table of random numbers was entered at hazard and the order
of first encounter with the digits 1, 2, 4, and 6 was noted. This order was
2-6-4-1. Accordingly, the empirical χ² curve for 2 df was computed first,
followed by the curves for 6 df, 4 df, and 1 df.

The table of random digits was entered again and a second line of random
digits selected at random. The first random digits of this line were 2-2-7-3,
which was taken as the number of the line in the table of random normal deviates
at which to begin computations. These “random normal deviates” are, of course,
z scores with population mean of zero and population variance of unity. In any
subsample drawn from the table, some variation from the theoretical mean and
variance is to be expected. The plan called for four sets of synthetic χ² values,
as follows:


INCLUSIVE            z SCORES SQUARED      χ² CURVE
ENTRY NUMBERS        AND SUMMED            APPROXIMATED

2273-2372            First 2 columns       2 df
2373-2472            First 6 columns       6 df
2473-2572            First 4 columns       4 df
2573-2672            First column          1 df


Examples of the computation of the artificial χ²'s follow:


ENTRY
NUMBER    “GAUSSIAN DEVIATES” = z SCORES                      Σz² = χ²    df

2273       .747    −.284                                         .64       2
2274       .491     .517                                         .51       2

2373       .794    −.794    −.979    −.523   −1.906    1.317    7.86       6
2374      −.966    1.051    −.928   −3.293     .871    −.356   14.63       6

2473      −.952    −.065   −1.214   −1.369                      4.26       4
2474       .219     .486    1.064    −.071                      1.42       4

2573       ...                                                   .02       1
2574       .412                                                  .17       1


In Table 12.2 the hundred χ²'s for each df (2, 6, 4, and 1) have been converted
to proportions and compared with corresponding p_e found from published
tables of complete χ² curves. The proportions are always values within indicated
limits, so that corresponding graphs are histograms (Fig. 12.2) rather than
continuous curves. It is to be noted that the fit of the observed to expected
proportions is close.

In Table 12.3 both the p_e and p_o have been cumulated toward 0 to find
proportions of χ² expected to exceed stated values of χ².

Table C in the Appendix, which gives the probability of exceeding tabulated
values of χ² for specified df, is in effect a modification and extension of the
type of information presented in Table 12.3. By interpolation, fair
approximations of the χ² values in the body of Table C can be found. For example,
for 6 df, what value of χ² is exceeded by 5 percent of obtained χ² entirely by
chance; that is, what is the χ² value corresponding to P = .05?

From the P_e column for 6 df in Table 12.3, it is apparent that the required
value is between 12 and 13. By interpolation, χ² for P = .05 is 13 − (.007)/(.019)
= 12.63. The .007 is what must be added to .043 to obtain .050, while .019 is the
proportion of the chance χ² in the step 12 to 13 (Table 12.2). In Table C the
value of χ² for 6 df and P of .05 is 12.59, which is reasonably close to the value
of 12.63 found by interpolation. Both values are, of course, derived from the
same mathematical function, and close agreement is to be expected.
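For an even number of degrees of freedom the upper-tail probability of χ² has a simple closed form, exp(−χ²/2) multiplied by the first df/2 terms of the series for exp(χ²/2). The sketch below uses that form, which is a standard result assumed here rather than taken from the text, to compute tail proportions and to locate the value exceeded with a stated probability; the function names are hypothetical, and bisection takes the place of interpolation.

```python
from math import exp, factorial


def chi_square_upper_tail(x, df):
    """P that chi square exceeds x; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of df")
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


def critical_value(p, df, low=0.0, high=200.0):
    """Chi square exceeded with probability p, located by repeated halving."""
    for _ in range(100):
        mid = (low + high) / 2.0
        if chi_square_upper_tail(mid, df) > p:
            low = mid    # tail still too large: the value sought is higher
        else:
            high = mid
    return (low + high) / 2.0


print([round(chi_square_upper_tail(x, 6), 3) for x in (12, 13)])
print(round(critical_value(0.05, 6), 2))
```

The proportions printed can be set beside the P_e columns for 2, 4, and 6 df, and the located value beside the corresponding entry of Table C.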


[FIG. 12.2(a). THEORETICAL AND OBSERVED χ² DISTRIBUTIONS (HISTOGRAM FORM).
Panels: theoretical, df = 1; observed, df = 1, N = 100; observed, df = 2,
N = 100. (Data are the same as in Example 12.2.)]


From Figure 12.2, based on the data of Example 12.2, it can be seen that
the χ² distributions for df = 1 are highly skewed. (The theoretical
distribution is plotted as a histogram to make it comparable to the obtained
distribution. Had the theoretical distribution been plotted as a continuous
mathematical function, it would have been higher at the left and more
elongated at the right, and hence even more skewed.) For df = 2, the
theoretical curve is still highly skewed, but less so than for df = 1. For
4 df, the mode of the theoretical curve is in the interval between 2.00 and
3.00, and skew is still further reduced. While the curve for df = 6 is not
symmetrical, it appears that as the degrees of freedom continue to
increase, normality might be reached.


[FIG. 12.2(b). THEORETICAL AND OBSERVED χ² DISTRIBUTIONS (HISTOGRAM FORM).
Panels recoverable: theoretical and observed, df = 4, N = 100; theoretical,
df = 6.]

Actually the central limit theorem does apply, and any χ² curve for more
than 30 df is regarded as essentially normal. However, when df > 30, χ²
itself is not tested to find its place in the normal distribution. Rather, a
function of χ² is used, a function for which the normal curve is a better
approximation than it is for χ² itself. Generally, this function is √(2χ²),
which has a mean of √(2ν − 1) and unit variance. A still more accurate
approximation is the function ∛(χ²/ν), with a mean of 1 − (2/9ν) and
variance of 2/9ν, ν in all cases being the degrees of freedom.

As far as χ² itself is concerned, the mean is always ν (the number of df),
the variance is 2ν, and the mode (except for the curve for 1 df) is ν − 2.
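The two normalizing functions can be compared numerically. In the sketch below, the values χ² = 55.76 and 40 df are illustrative choices, not figures from the text, and the function names are hypothetical. An exact tail probability, available in closed form because the df is even, serves only as a yardstick.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist


def exact_upper_tail(x, df):
    """Exact P(chi square > x) for even df, used here only as a yardstick."""
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


def sqrt_2chi_tail(x, df):
    """Treat sqrt(2 chi^2) as normal with mean sqrt(2 df - 1) and unit variance."""
    z = sqrt(2.0 * x) - sqrt(2.0 * df - 1.0)
    return 1.0 - NormalDist().cdf(z)


def cube_root_tail(x, df):
    """Treat the cube root of chi^2/df as normal: mean 1 - 2/(9 df), variance 2/(9 df)."""
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * df))) / sqrt(2.0 / (9.0 * df))
    return 1.0 - NormalDist().cdf(z)


x, df = 55.76, 40   # illustrative values, assumed for this sketch
print(exact_upper_tail(x, df), sqrt_2chi_tail(x, df), cube_root_tail(x, df))
```

For these values the cube-root function lands closer to the exact probability than the square-root function does, which is consistent with its description as the more accurate approximation.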

RELATIONSHIP BETWEEN χ² AND THE NORMAL CURVE


Certain relationships between χ² and the normal curve may now be
summarized. Since the χ² curve for 1 df can be reproduced by squaring a
normally distributed variable in z form, √χ² for 1 df is distributed as the
positive half of a normal distribution.
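This relationship gives a direct way to find the tail of the χ² curve for 1 df from the normal curve alone: a χ² greater than x corresponds to a z beyond √x in either direction. The sketch below uses `statistics.NormalDist` in place of a printed normal table; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chi_square_1df_upper_tail(x):
    """P(chi square > x) for 1 df: both tails of the normal curve beyond sqrt(x)."""
    return 2.0 * (1.0 - NormalDist().cdf(sqrt(x)))


print([round(chi_square_1df_upper_tail(x), 3) for x in (1, 2, 3, 4)])
```

The proportions obtained for χ² of 1, 2, 3, and 4 can be compared with the P_e column for 1 df in Table 12.3.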

A relationship already noted is that as degrees of freedom increase, the 
resultant distribution becomes a better and better approximation of the 
complete normal curve, centered around a mean of v. 
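
As an illustration of the large-df approximations described above, the following minimal Python sketch converts a χ² into a normal deviate by the square-root function and by the cube-root function, and also returns the mean, variance, and mode of χ² for a given v. The function names and the values χ² = 50 with 40 df are hypothetical choices made only for this example; they do not come from the text.

```python
import math


def sqrt_approx_z(chi2: float, v: int) -> float:
    """z from sqrt(2*chi2), treated as normal with mean sqrt(2v - 1) and unit variance."""
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * v - 1.0)


def cube_root_approx_z(chi2: float, v: int) -> float:
    """z from (chi2/v)**(1/3), treated as normal with mean 1 - 2/(9v), variance 2/(9v)."""
    mean = 1.0 - 2.0 / (9.0 * v)
    sd = math.sqrt(2.0 / (9.0 * v))
    return ((chi2 / v) ** (1.0 / 3.0) - mean) / sd


def chi2_summary(v: int) -> dict:
    """Mean, variance, and mode of the chi-square curve with v degrees of freedom."""
    # The rule "mode = v - 2" does not hold for 1 df, where the curve is highest at zero.
    mode = v - 2 if v >= 2 else 0
    return {"mean": v, "variance": 2 * v, "mode": mode}


if __name__ == "__main__":
    chi2, v = 50.0, 40  # illustrative values only
    print(round(sqrt_approx_z(chi2, v), 3))
    print(round(cube_root_approx_z(chi2, v), 3))
    print(chi2_summary(v))
```

Either z can then be referred to a table of the normal curve. For the illustrative values the two functions give nearly the same deviate, which is what one would expect when df is well above 30.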

Since the mathematical derivation of χ² involves the squares of normally
distributed z scores, it is also clear why $f_e$ must not become too small. If
an $f_e$ were permitted to become, say, 1, then the contribution to χ² could
be expected to be skewed. Another way of stating this is: χ² assumes that
the sampling of differences between the $f_o$ and $f_e$ follows the normal curve.
On the other hand, no assumptions are made about the shape of the
distribution of the $f_o$ themselves.


OTHER CONSIDERATIONS IN THE USE OF χ²


χ² is a statistic that corresponds to no parameter in the population. It
is a pure number, evaluated only in reference to v (the number of df)
and not to N (the number of cases). As the degrees of freedom increase,
possible values of χ² increase, and the size of the χ² required for a fixed
value of P also increases.

In Table 12.3 are shown theoretical P's derived from the mathematical
χ² curves, together with $P_o$'s obtained by cumulating the proportions of
Table 12.2 toward zero. The relationship between the theoretical P's and
those derived empirically is close.

For 1 df through 30 df, χ² can be evaluated directly, and often with
sufficient accuracy, from Fig. 3.4, which shows lines representing P = .10,
P = .05, P = .01, and P = .001. The graph shows df as a continuous
function; however, df is always integral, since fractional v's never occur.

As an example of the use of Fig. 3.4, take a χ² of 20, with 10 df. The
intersection of the horizontal line for 10 df and the vertical line for a
χ² value of 20 is between the curves for P = .05 and P = .01. Accordingly,
.05 > P > .01, and the hypothesis that provided the $f_e$'s in the computation
of χ² is rejected at the .05 level but not at the .01 level.

More frequently, tabled values of χ² are used instead of charts. It will
be noted, however, that when Table C (Appendix) is entered for 10 df
and χ² of 20, the conclusion is exactly the same as when Fig. 3.4 is used.
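
The chart reading in this example can be checked numerically. The sketch below is a minimal illustration, not the method of the text: it relies on the fact that for an even number of df the upper-tail P of χ² has a simple closed form, so no table or special library is needed. The function name is hypothetical, and the shortcut does not apply to odd df.

```python
import math


def chi2_upper_tail_even_df(chi2: float, v: int) -> float:
    """P of a chi-square at least this large, for an even number of df.

    For v = 2k the tail equals exp(-x/2) * sum over j < k of (x/2)**j / j!.
    """
    if v <= 0 or v % 2 != 0:
        raise ValueError("this closed form applies only to even, positive df")
    half = chi2 / 2.0
    total = sum(half ** j / math.factorial(j) for j in range(v // 2))
    return math.exp(-half) * total


if __name__ == "__main__":
    p = chi2_upper_tail_even_df(20.0, 10)
    print(round(p, 4))        # about .029
    print(0.05 > p > 0.01)    # True: between the .05 and .01 curves
```

The computed P of about .029 falls between .05 and .01, in agreement with the reading taken from the chart.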

The use of χ² always assumes that the events or measures or observations
that form the basis of the frequencies are independent. Thus, the frequencies
must not represent the same group of subjects over and over again or even
two or more observations on the same person.

χ² has an important additive property. With all frequencies independent,
two or more sets of observations in their respective categories may be
regarded as a combined set of categories, with df equal to the sum of the
df of the categories as originally grouped. Conversely, a χ² over a number
of categories may be partitioned into two or more χ²'s over subsets of
categories. This flexibility is an appropriate deduction from the way
theoretical χ² curves are established as sums of squares of v normally
distributed variables, v being the number of df.
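
The additive property can be illustrated by simulation. The following sketch, offered under stated assumptions rather than as a procedure from the text, builds artificial χ²'s as sums of squared random normal deviates, adds an independent χ² with 4 df to one with 6 df, and checks that the sums behave like χ² with 10 df (mean 10, variance 20). The function names, the seed, and the number of draws are arbitrary illustrative choices.

```python
import random
import statistics


def chi2_draw(v: int, rng: random.Random) -> float:
    """One artificial chi-square: the sum of v squared random normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))


def simulate_sum(v1: int, v2: int, n: int = 20000, seed: int = 12345):
    """Mean and variance of the sum of independent chi-squares with v1 and v2 df."""
    rng = random.Random(seed)
    sums = [chi2_draw(v1, rng) + chi2_draw(v2, rng) for _ in range(n)]
    return statistics.fmean(sums), statistics.variance(sums)


if __name__ == "__main__":
    mean, var = simulate_sum(4, 6)
    # Theory for 4 + 6 = 10 df: mean 10, variance 20.
    print(round(mean, 2), round(var, 2))
```

Because the result depends on random draws, the printed values only approximate the theoretical mean and variance; a different seed gives slightly different figures.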


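RELATIONSHIP BETWEEN χ² AND φ


In the special case of a 2 by 2 diagram, there is a fixed relationship between
χ² and φ. Let internal and marginal frequencies be represented as follows:


                       VARIABLE X
                  a          b          (a + b)
VARIABLE Y        c          d          (c + d)
                  (a + c)    (b + d)    N


By Formula 3.6, $\chi^2 = N\left[\sum \left(f_o^2 / f_r f_c\right) - 1\right]$, where
$f_r$ and $f_c$ are the marginal frequencies of the row and column containing
the cell. Thus, in terms of frequencies,

$$\chi^2 = N\left[\frac{a^2}{(a+b)(a+c)} + \frac{b^2}{(a+b)(b+d)} + \frac{c^2}{(a+c)(c+d)} + \frac{d^2}{(b+d)(c+d)} - 1\right]$$

This simplifies to

$$\chi^2 = \frac{N(bc - ad)^2}{(a+b)(a+c)(b+d)(c+d)}$$

which in turn, by Formula 9.16a for φ, simplifies to

$$\chi^2 = N\phi^2 \tag{12.9}$$

or

$$\phi = \sqrt{\frac{\chi^2}{N}} \tag{12.9a}$$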
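
The relationship between χ² and φ in a 2 by 2 diagram can be verified with a short computation. The sketch below is illustrative only: the function names and the cell frequencies 10, 20, 30, 40 are hypothetical. It computes χ² both from the shortcut in terms of a, b, c, d and from observed and expected cell frequencies, and then recovers the size of φ by Formula 12.9a.

```python
import math


def chi2_2x2(a: float, b: float, c: float, d: float) -> float:
    """Chi-square for a 2 by 2 table: N(bc - ad)^2 over the product of the marginals."""
    n = a + b + c + d
    return n * (b * c - a * d) ** 2 / ((a + b) * (a + c) * (b + d) * (c + d))


def chi2_from_expected(a: float, b: float, c: float, d: float) -> float:
    """The same chi-square computed cell by cell from observed and expected frequencies."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            total += (observed[i][j] - expected) ** 2 / expected
    return total


def phi_from_chi2(chi2: float, n: float) -> float:
    """Formula 12.9a: the absolute size of phi from chi-square and N."""
    return math.sqrt(chi2 / n)


if __name__ == "__main__":
    cells = (10, 20, 30, 40)  # hypothetical frequencies a, b, c, d
    x2 = chi2_2x2(*cells)
    print(round(x2, 4), round(chi2_from_expected(*cells), 4))
    print(round(phi_from_chi2(x2, sum(cells)), 4))
```

Note that the square root in Formula 12.9a yields only the magnitude of φ; its sign has to be taken from the direction of the difference (bc − ad).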
THE t DISTRIBUTION


Of considerable utility in psychological research is the t distribution,
published in 1908 by William S. Gosset, who wrote under the pseudonym
“Student” (9). Gosset's discovery is often considered as the beginning of
the modern era in mathematical statistics, with its emphasis on the use of
exact sampling distributions and on making the best possible use of
observed data for inferring parameters.

As with χ², there is a separate t distribution for each value of v, the
degrees of freedom, and, as with χ² (and some other theoretical
distributions), the t distribution approaches normality as a limit as v
becomes indefinitely large.

Each of the t distributions is conceived as the distribution of the ratio of:


1. A numerator variable distributed normally with mean of zero; and
2. A denominator variable distributed independently of the numerator
and as the square root of χ² divided by v.


If z is a normally distributed random variable with mean of zero and
unit variance, then the t distribution can be described as

$$t = \frac{z}{\sqrt{\chi_v^2 / v}} \tag{12.10}$$

If $z_i, z_j, z_k, z_l, z_m, \ldots$ are random normal variables in z form, the
following distributions for t may be envisaged:

$$\text{For 1 df,}\quad t_1 = \frac{z_i}{\sqrt{z_j^2}}$$

$$\text{For 2 df,}\quad t_2 = \frac{z_i}{\sqrt{\dfrac{z_j^2 + z_k^2}{2}}}$$

$$\text{For 3 df,}\quad t_3 = \frac{z_i}{\sqrt{\dfrac{z_j^2 + z_k^2 + z_l^2}{3}}}$$

$$\text{and for 4 df,}\quad t_4 = \frac{z_i}{\sqrt{\dfrac{z_j^2 + z_k^2 + z_l^2 + z_m^2}{4}}}$$

Inspection of these formulas makes it apparent why there is a separate
t distribution for each value of v. Since $z_i$ is symmetrical around 0, and
since the denominator is always positive, t is also symmetrical around 0.
Both, of course, vary from −∞ to +∞, and it can be shown that the
inflection points are at values $\pm\sqrt{v/(v + 2)}$, while the variance is
$v/(v - 2)$ for v greater than 2.
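
The construction of t as a ratio can be tried out by simulation. The sketch below is a minimal illustration under stated assumptions: it forms artificial t's from random normal deviates exactly as in Formula 12.10 and compares the sample variance with the theoretical value v/(v − 2), which holds only for v greater than 2. The function names, the choice of 10 df, the seed, and the number of draws are all arbitrary.

```python
import math
import random
import statistics


def t_draw(v: int, rng: random.Random) -> float:
    """One artificial t: a normal deviate over sqrt(chi-square / v), as in Formula 12.10."""
    numerator = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))
    return numerator / math.sqrt(chi2 / v)


def simulate_t(v: int, n: int = 40000, seed: int = 2024):
    """Mean and variance of n artificial t values with v degrees of freedom."""
    rng = random.Random(seed)
    values = [t_draw(v, rng) for _ in range(n)]
    return statistics.fmean(values), statistics.variance(values)


if __name__ == "__main__":
    v = 10
    mean, var = simulate_t(v)
    # Theory: mean 0 (symmetry around zero) and variance v / (v - 2) = 1.25 for 10 df.
    print(round(mean, 3), round(var, 3), v / (v - 2))
```

With few df the extreme values of t are frequent enough that the sample variance is unstable, which is why a moderate df was chosen for the illustration.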

Although, for any value of t, the uncorrelated denominator z values do
not necessarily sum to zero, it is apparent that the term $\sqrt{\chi^2/v}$ has a
strong resemblance to a standard deviation. Intuitively, t might seem to be an
appropriate model distribution for a normally distributed difference
divided by the standard error of that difference.


COMPARISON OF THEORETICAL AND OBSERVED t

Table 12.4 gives the following information:

1. The expected proportion of values of t within stated limits, as
determined from the table of t for v = 1;

2. Actual frequencies of 100 observed values of $z_i/z_j$, found by drawing
pairs of values from a table of the random normal deviate and
computing the ratios;

3. The expected proportions of values of t within stated limits for v = 6,
as found in a table; and

4. Actual frequencies of 100 observed values of $z_i/\sqrt{\chi_6^2/6}$. (Actually, the
artificial χ²'s tabulated in Table 12.2 were used as the values for χ²
with 6 df.)
TABLE 12.4. EXPECTED AND OBSERVED VALUES OF t (1 df and 6 df)


                          v = 1                      v = 6
                   EXPECTED    OBSERVED       EXPECTED    OBSERVED
VALUES OF t       PROPORTION  PROPORTION     PROPORTION  PROPORTION
Above 5.5            .057        .07            .001        .00
4.5 to 5.5           .012        .02            .001        .00
3.5 to 4.5           .019        .01            .004        .00
2.5 to 3.5           .033        .08            .017        .01
1.5 to 2.5           .066        .06            .069        .08
.5 to 1.5            .165        .17            .225        .28
-.5 to .5            .295        .34            .365        .32
-1.5 to -.5          .165        .12            .225        .22
-2.5 to -1.5         .066        .06            .069        .08
-3.5 to -2.5         .033        .01            .017        .00
-4.5 to -3.5         .019        .01            .004        .00
-5.5 to -4.5         .012        .02            .001        .01
Below -5.5           .057        .03            .001        .00
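
The expected proportions for v = 1 in Table 12.4 can be reproduced without a table of t, because the ratio of two independent normal deviates follows the Cauchy curve, whose cumulative proportion below t is 1/2 + arctan(t)/π. The sketch below is an illustration built on that known result; the function names are hypothetical.

```python
import math


def t1_cdf(t: float) -> float:
    """Cumulative proportion below t for the t distribution with 1 df (the Cauchy curve)."""
    return 0.5 + math.atan(t) / math.pi


def t1_expected_proportions(edges):
    """Proportions above the top edge, between successive edges, and below the bottom edge."""
    edges = sorted(edges, reverse=True)
    props = [1.0 - t1_cdf(edges[0])]
    for upper, lower in zip(edges, edges[1:]):
        props.append(t1_cdf(upper) - t1_cdf(lower))
    props.append(t1_cdf(edges[-1]))
    return props


if __name__ == "__main__":
    edges = [5.5, 4.5, 3.5, 2.5, 1.5, 0.5, -0.5, -1.5, -2.5, -3.5, -4.5, -5.5]
    for p in t1_expected_proportions(edges):
        print(f"{p:.3f}")
```

Rounded to three places, the printed proportions agree with the expected column for v = 1. No comparably simple formula exists for v = 6, so that column is best taken from a table.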


In both cases, the fit of the observations to the theoretical distributions
seems reasonably good. Both empirical curves are symmetrical, and as
expected, the curve for v = 1 is much flatter than for v = 6.

Both the curve for v = 1 and for v = 6 are too platykurtic to be
considered normal. Actually, it can be shown that when v = ∞, t is normal,
and it is practically so when v = 25.

The model provided by the t distribution is useful in evaluating the value
of a normally distributed variable (such as the difference between a sample
mean and the parameter) by dividing it by an unbiased estimate of its
sampling error. The ratio can then be interpreted by reference to the
appropriate t distribution. Specific applications are discussed in Chapter 13.


THE F DISTRIBUTION

Somewhat more general in its application than the t distribution is the
distribution of F, so named² in honor of its discoverer, Sir Ronald Fisher.
If random normal variates are $z_1, z_2, \ldots, z_v$ and $z'_1, z'_2, \ldots, z'_{v'}$,
then F may be defined as

$$F = \frac{\left(z_1^2 + z_2^2 + \cdots + z_v^2\right)/v}{\left({z'_1}^2 + {z'_2}^2 + \cdots + {z'_{v'}}^2\right)/v'} \tag{12.11}$$

From Formula 12.11 it is seen that:


1. F is always positive, with the possibility of values up to ∞;

2. When and only when the numerator variable has a single degree of
freedom, $t = \sqrt{F}$ and $F = t^2$;

3. For every pair of df there is a distinct distribution; and

4. The shape of the F distribution varies with the two values, v and v′.


From Formula 12.11 it is also reasonable to believe, as is actually the
case, that when v and v′ are both very large, the F distribution approaches
normality.

Probably the most important use of F is that it can be taken as the 
distribution of the ratio of two independent variances, each with its own 
df. It was the discovery of the F distribution that made possible the 
development of the analysis of variance, described briefly in Chapter 14. 


RELATIONSHIP OF F TO χ²


Examination of Formula 12.11 shows that the numerator of F is distributed
as χ² with v degrees of freedom (v being a constant divisor), while the
denominator is distributed as χ² with v′ df (v′ being the constant divisor
in the denominator term).

In terms of χ², Formula 12.11 becomes

$$F = \frac{\chi_v^2 / v}{\chi_{v'}^2 / v'} \tag{12.11a}$$

which shows that F is the ratio of two values of χ², each divided by the
appropriate number of degrees of freedom.


EMPIRICAL F DISTRIBUTION 


To develop an empirical distribution of F, the 100 artificial χ²'s with 6 df,
the distribution of which is in the last column of Table 12.2, were matched
at random with the 100 artificial χ²'s with 4 df in the same table. The
distribution of the resultant ratios, appropriately weighted by 6 df and 4 df,
is shown in Table 12.5.

Tables of the F function usually show, for each pair of df, the values
beyond which are found 5 and 1 percent of the F's by chance.

² By G. W. Snedecor, who prepared the first tables of F.


TABLE 12.5. AN EMPIRICAL DISTRIBUTION OF F

(v = 6; v′ = 4)

VALUES OF F        f
10 or more         2
9-9.999            0
8-8.999            1
7-7.999            2
6-6.999            1
5-5.999            2
4-4.999            2
3-3.999            6
2-2.999            7
1-1.999           30
0-0.999           47
              N = 100


For 6 df and 4 df,
5 percent level: F = 6.16.
1 percent level: F = 15.21.


From Table F (Appendix) it is seen that with 6 df for the greater mean
square and 4 df for the lesser, 1 percent of the F's by chance would be
beyond 15.21 and 5 percent beyond 6.16. In the empirical chance
distribution, 2 percent of the F's are 10 or more, and 6 percent are 6 or more,
so that the empirical distribution is probably not greatly different from
the one that would be developed mathematically for this particular pair of
df.
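
The empirical procedure just described can be repeated on a larger scale by machine. The following sketch is an illustration only, with hypothetical function names, an arbitrary seed, and 20,000 artificial ratios in place of the 100 used in Table 12.5. It forms F as in Formula 12.11a and counts the proportion of values beyond the tabled 5 and 1 percent points for 6 df and 4 df.

```python
import random


def chi2_draw(v: int, rng: random.Random) -> float:
    """One artificial chi-square: the sum of v squared random normal deviates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v))


def f_draw(v_num: int, v_den: int, rng: random.Random) -> float:
    """One artificial F: two independent chi-squares, each divided by its own df."""
    return (chi2_draw(v_num, rng) / v_num) / (chi2_draw(v_den, rng) / v_den)


def tail_proportions(v_num=6, v_den=4, n=20000, seed=7, cutoffs=(6.16, 15.21)):
    """Proportion of artificial F values beyond each cutoff."""
    rng = random.Random(seed)
    values = [f_draw(v_num, v_den, rng) for _ in range(n)]
    return {c: sum(f > c for f in values) / n for c in cutoffs}


if __name__ == "__main__":
    for cutoff, prop in tail_proportions().items():
        print(cutoff, round(prop, 3))
```

With many more ratios than the original 100, the proportions beyond 6.16 and 15.21 should lie close to .05 and .01, although they will not match exactly because they rest on random draws.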


SUMMARY 


Among the theoretical distributions that observed distributions sometimes 
approximate are the normal (discussed in Chapter 11), the binomial, and 
the Poisson. The normal distribution may be developed as a continuous 
and limiting case of the binomial, and the binomial may be approximated 
very closely by another discontinuous distribution, the Poisson. 

In the testing of statistical hypotheses, a general strategy is to evaluate
the rarity of a specific empirical finding in terms of the distribution of the
statistic developed on the basis of a probability model. Both the normal
distribution, described in Chapter 11, and the binomial are often used for
this purpose. Three other distributions, which vary with the degrees of
freedom and which are applicable to statistics rather than to the
observations on which statistics are based, are χ², t, and F.

χ² can be developed as the sum of v squares of random z scores; t can
be developed as a random z score divided by the square root of $\chi_v^2/v$;
and F, as the ratio of $\chi_v^2/v$ to $\chi_{v'}^2/v'$. Both χ² and F are always positive.
With few degrees of freedom, χ² and F are highly skewed, but both
approach normality as the degrees of freedom become larger. The
distribution of t is symmetrical around zero. When v = 1, it is very platykurtic,
but it becomes more peaked as the degrees of freedom increase, eventually
becoming normal.


EXERCISES 


1. A subject in a perceptual experiment is required to say whether two stimuli
presented simultaneously are the same or different, and his responses are
considered correct or incorrect. What is the chance expectation of five or
six successes in six attempts?


2. A short classroom quiz consisted of 5 five-choice items. If 100 students were
to take the test, and success on each item were completely at random, what
would be the distribution of scores?


3. Prepare a table showing the mean, mode, and standard deviation of the
chi-square curves from df = 2 to df = 20.


4. In a problem with 42 df, χ² = 112.5. Is χ² significantly different from zero?


5. In a perceptual learning experiment, three of the five possible choices on each
trial were considered correct. If a subject had 59 successes in a run of 96 trials,
should his performance be considered as definitely better than could be
expected by chance?


6. From Table 12.4 the data for expected and observed values of t for 6 df may
be consolidated as follows:


VALUE OF t        f_e    f_o
Above 1.5           9      9
.5 to 1.5          23     28
-.5 to .5          36     32
-1.5 to -.5        23     22
Below -1.5          9      9


By the χ² test of goodness of fit determine whether the observed values of the
artificial t depart significantly from the expected distribution. (Assume v = 4.)


7. For N = 50, φ = .20. By the χ² test, can the association between the two
variables be considered significantly different from zero?


8. The following data from Jones (3) represent the frequency of aircraft accidents
experienced by 2546 pilots during a four-year period.


NUMBER OF
ACCIDENTS         f
5                 1
4                 3
3                13
2                71
1               422
0              2036


Fit a Poisson to this distribution and apply the χ² test of goodness of fit to
test whether the hypothesis of the data following the Poisson is tenable.


ROLE OF DESCRIPTIVE STATISTICS IN GENERALIZATION


Descriptive statistics, computed from particular samples, provide a basis
for making inferences about the populations the samples are taken to
represent. The scientific psychologist has as his primary interest the
establishment of sound generalizations, that is, the discovery of principles that
apply not only to the sample that he has investigated but also to the
population, from which, theoretically at least, samples not yet observed
can be selected. If the generalization is correct, it will apply to new samples
not used in its development. This chapter is concerned with statistical
procedures used in making valid inferences about the population even
though direct knowledge is always limited to samples.

It will be seen that knowledge of the population is, in general, not
certain, but rather probable. Some collections of observations are
worthless for making generalizations because of inappropriate techniques for
establishing the sample or inadequate numbers of cases, or faulty
observations, or improper statistical manipulations. On the other hand, by
stating assumptions and procedures very carefully, by refining the means of
making the observations, and by extending the number of properly
selected cases, it is often possible to make statements about the population
with a high degree of probability, sometimes amounting to virtual
certainty.


DEVELOPMENT OF REPRESENTATIVE SAMPLES 

Occasionally, as in certain aspects of the U.S. Census, sample and
population coincide. More commonly a relatively small group is used to
represent all cases of interest, and statistics computed within the group are
used to infer characteristics of the population. Three methods used to
develop representative groups are:


1. Random sampling;
2. Stratified sampling; and
3. Area sampling.


In random sampling, every case in the population must have equal
probability of being selected at each step in developing the sample.
Procedures designed to accomplish this vary with whether the population
is finite and knowable or whether it is considered unlimited or even
infinite. With a finite population all members may be numbered and then
some sort of mechanical device or table of random numbers used to
select the sample. If individuals are already arranged in an order that can
be considered random and a subsample of N′ cases is needed from the total
sample of N cases, then each (N/N′)th case may be used, provided bias is
avoided in selecting the initial case.

When there is a relatively unlimited pool of cases from which to draw,
more elaborate procedures may be required, such as using a table of
random numbers to decide the interval between each successive case
drawn from an unending and naturally occurring sequence.

In stratified sampling, the population is divided in categories on one or
more variables of interest, and then random samples are taken from each
of these categories in such a way that the sample has the same proportions
by categories as the population has as a whole. Stratified sampling is
often used in conducting polls to forecast voting behavior, and the
categories are variables related to voting tendencies. These may be how
individuals voted in a previous election, occupation, economic status,
urban-rural classification, and the like.

A variant of stratified sampling is called quota sampling, in which the
subsamples are not randomly determined, but sampling is continued until a
certain number of cases in each category is obtained.

In area sampling, the total population is broken down into smaller units
and a random sample of the units is chosen. Thus the census of a country
might be taken by laying off the entire area in small sections and choosing a
random sample of these areas for a precise count. Undoubtedly, the
sample would be improved by a mixture of stratified and area sampling in
which areas were first classified by categories and then sections chosen
within each of the categories.
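
Two of the procedures described above lend themselves to a brief sketch: taking every (N/N′)th case after an unbiased choice of the initial case, and drawing a stratified sample with the same proportions by categories as the population. The Python below is a minimal illustration under stated assumptions, not a full sampling plan: the function names and the urban and rural categories are hypothetical, the interval is taken as the whole-number part of N/N′, and rounding of the category quotas may make the stratified total differ slightly from the number requested.

```python
import random


def systematic_sample(cases, n_sub, rng):
    """Every k-th case after a random start, with k the whole-number part of N / N'."""
    k = len(cases) // n_sub
    if n_sub < 1 or k < 1:
        raise ValueError("the subsample must contain 1 to N cases")
    start = rng.randrange(k)  # random initial case, to avoid bias in where to begin
    return [cases[start + i * k] for i in range(n_sub)]


def stratified_sample(strata, n_sub, rng):
    """Random draws within each category, in proportion to the category's size."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(n_sub * len(members) / total)  # rounding may shift the total slightly
        sample[name] = rng.sample(members, quota)
    return sample


if __name__ == "__main__":
    rng = random.Random(1)
    population = list(range(100))  # cases numbered 0 to 99
    print(systematic_sample(population, 10, rng))
    strata = {"urban": population[:60], "rural": population[60:]}
    print({name: len(drawn) for name, drawn in stratified_sample(strata, 10, rng).items()})
```

The systematic procedure is appropriate only when the existing order of the cases can itself be considered random, as the text requires.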

Many different sampling plans, some of which cannot be classified too 
clearly, can be used to develop a group from which characteristics of the 
population can be inferred. Whatever the precise details of constructing 
the sample, it must meet three requirements: 


1. The sample must represent the population, that is, it must be unbiased; 

2. It must be large enough so that the statistics computed on it will be 
reliable; and 

3. It must be small enough so that data collection and subsequent analysis 
will be efficient. 


Size is no guarantee of the quality of the sample. If a small sample is 
representative, it may be more efficient than a large one. 


ESTIMATING PARAMETERS FROM STATISTICS 


A statistic computed from a sample is frequently used to estimate a 
parameter, the corresponding value in the population. 

In theoretical distributions, parameters are often known, but parameters
corresponding to statistics in observed psychological data can only be
estimated and have no exact numerical values. The fact that a statistic is
used to estimate a parameter can be indicated by the symbol ≐ (“equals
approximately”). Thus, if $\sigma_x \doteq 12.5$, we know that the value of 12.5,
obtained from $s_x$ in a sample, is taken as an estimate of a parameter.


BIASED AND UNBIASED ESTIMATES OF PARAMETERS 


Some of the statistics computed from samples are the best available 
estimates of the corresponding parameters. Examples are p, the proportion 
of cases falling in one of two categories, and M, the mean. These statistics 
may be incorrect estimates for various reasons, such as poor sampling 
techniques, inadequate numbers of cases, or faulty measurement pro- 
cedures, but they exhibit no bias or systematic distortion. 

In the case of the mean, it is readily seen that in a truly random sample
of cases drawn from a normally distributed population, observed scores
X units above the parameter $\mu_x$, and observed scores X units below $\mu_x$, are
equally likely to be selected. Hence, there is no tendency for $M_x$ to have
a predetermined relationship to $\mu_x$, and $M_x$ can be taken as an unbiased
estimate of $\mu_x$. Actually, $M_x$ is an unbiased estimate of $\mu_x$ whether or
not the variable is normally distributed in the population.

As contrasted with p and the mean, the standard deviation and the
variance in the sample are biased in that they are underestimates of
corresponding parameter values. By mathematical methods, it can be
shown that, on the average, an appropriate allowance can be made for the
bias by using (N − 1) as the divisor instead of N. When N is large (say,
greater than 30), the correction is inconsequential, but with small numbers
of cases the correction is essential for making an unbiased estimate of the
parameter. It is to be noted that (N − 1) is actually the number of degrees
of freedom in a frequency distribution with a fixed mean. (In effect, the
mean must be determined from the sample data before it is possible
to find the variance.) The formulas for unbiased estimates of parameter
values of the variance and standard deviation are

$$\sigma_x^2 \doteq s_x^2 = \frac{\sum x^2}{N - 1} \tag{13.1}$$

and

$$\sigma_x \doteq s_x = \sqrt{\frac{\sum x^2}{N - 1}} \tag{13.2}$$
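
The effect of the (N − 1) divisor can be seen in a small computation. The sketch below is illustrative only: the function names and the eight scores are hypothetical, and x is taken to be the deviation of a score from the sample mean. It returns the variance with N as divisor alongside the estimates of Formulas 13.1 and 13.2.

```python
import math


def sum_of_squared_deviations(scores):
    """Sum of squared deviations of the scores from their own mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def sample_variance(scores):
    """Variance with N as the divisor (tends to underestimate the parameter)."""
    return sum_of_squared_deviations(scores) / len(scores)


def estimated_variance(scores):
    """Formula 13.1: variance with (N - 1) as the divisor."""
    return sum_of_squared_deviations(scores) / (len(scores) - 1)


def estimated_sd(scores):
    """Formula 13.2: square root of the (N - 1) variance."""
    return math.sqrt(estimated_variance(scores))


if __name__ == "__main__":
    scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, N = 8
    print(sample_variance(scores))               # 4.0
    print(round(estimated_variance(scores), 4))  # 4.5714
    print(round(estimated_sd(scores), 4))        # 2.1381
```

With only eight cases the two divisors give noticeably different values; with N in the hundreds the difference would be negligible, as the text indicates.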


DISTRIBUTION OF A STATISTIC 
As indicated in Chapter 11, every statistic theoretically has its “distribution,”
defined as the array of values that would be found if the statistic were
computed for each one of a series of samples of the same size drawn
completely at random from an unlimited population. A primary endeavour
of modern mathematical statistics has been the discovery of these
“sampling distributions” or of satisfactory approximations to them. In any
given instance there is no way of knowing whether the obtained statistic
is above or below the parameter value. However, if it can be assumed that
the sampling procedure has been correct, a knowledge of the statistic
and its distribution often permits valid and useful inferences about the
parameter.

In this connection a set of values of a statistic, each computed from a
properly developed sample, is thought of as varying around the parameter.
The theoretical relative frequencies of this fluctuation constitute the
“distribution of the statistic,” a mathematical construct by means of
which a statistic obtained from a particular sample may often be evaluated.

The most important distributions of statistics are the normal, χ², the
t distribution, and F, all of which have been discussed. This chapter is
concerned with statistics distributed normally, such as means based on
samples drawn from a normal population, and with those that follow the
t distribution, such as the difference between two means divided by s.
Certain applications of the F distribution are presented in Chapter 14.
These distributions, as pointed out in Chapter 12, are interrelated and
also have relationships with other theoretical distributions, including χ².
Before using one as the distribution of a statistic, it is always necessary to
examine the conditions under which the statistic was obtained, to determine
whether the proposed mathematical model is actually applicable. The
form of the distribution is not necessarily a fixed characteristic of a statistic,
since it may vary with the distribution of the variable or variables in the
parent population, with the magnitude of the parameter value, with N,
and with v (the number of the degrees of freedom). The central limit
theorem is often crucial in this connection, since as numbers of cases or
numbers of summed variables increase, there is a tendency for a nonnormal
sampling distribution to become normal.


CONCEPT OF THE STANDARD ERROR 


The distributions in common use are always based on mathematical
functions that are chosen either because of theoretical considerations or
because the function has repeatedly been found to give a good fit to observed
data, or for both reasons. With the exception of the binomial (when n is
small) and the Poisson, direct numerical methods for finding the relative
frequency at various values or for a band of values of a theoretical
distribution tend to be too complex for routine usage. This is a chief reason for the
extensive use of tables in statistical analysis. Thus, when it is known that a
statistic is distributed normally, tables of the normal curve can be used in
evaluating an obtained value.

The true standard deviation of the distribution of a statistic is itself a 
parameter which, with empirical data, cannot be known. Nevertheless, an 
estimated standard deviation of the sampling distribution, called the 
standard error, can be found for a wide variety of statistics and is especially 
useful with those computed for large samples of cases. The standard error 
of the mean is a good illustration of the concept. 

Consider a parent population in which the variable is highly skewed. 
If a series of samples of one case each were drawn at random from this 
population, we should expect to find the distribution of these samples to 
approximate the distribution of the variable in the parent population. 

Now consider the distribution of sums of two cases, each drawn from
the same population. Because any extreme case would probably be coupled
with a case not so extreme, the shape of the distribution of these sums
would be less skewed than that of the original variable. In fact, the only
way in which the shape of the distribution of sums would repeat the shape
of the original distribution would require that all pairs of cases be formed
by coupling adjacent values. By random sampling, such a series of events
is so highly improbable that it may be dismissed as impossible. The shape
of the distribution of means would, of course, be exactly the same as the
distribution of sums. (Division by a constant N would affect absolute
numerical values, but not the shape. In this case the constant would be 2.)

With each subgroup forming the sum (or mean) selected completely at
random from the parent population, the skew would become still less in
samples of three cases each, and in general would continue to diminish as
the number of values in each sample became larger, eventually becoming
more or less normal. What is true for skewed populations is true also for
populations departing in other ways from the normal. The point at which a
set of means becomes “normal” depends, of course, on the criterion of
normality, on how the variable is distributed in the parent population, and
on N (the number of cases in each sample). It seems reasonable to regard
the mean as distributed normally when either of the following conditions,
or both, apply:


1. The variable is distributed normally in the population; or
2. The number of cases in the sample is large.
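
The gradual loss of skew described above can be observed by simulation. The sketch below is an illustration under stated assumptions: the parent population is taken to be exponential, chosen only because it is highly skewed, and the function names, seed, and number of samples are arbitrary. It draws many samples of a given size, computes their means, and reports a simple third-moment index of skewness for the set of means.

```python
import random
import statistics


def skewness(values):
    """Third-moment index of skewness: the mean cubed z score (0 for a symmetrical set)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((x - mean) / sd) ** 3 for x in values) / len(values)


def skew_of_means(sample_size, n_samples=10000, seed=99):
    """Skewness of the means of many random samples from a skewed (exponential) population."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return skewness(means)


if __name__ == "__main__":
    for size in (1, 4, 30):
        print(size, round(skew_of_means(size), 2))
```

For this parent population the skewness of the means shrinks roughly in proportion to the reciprocal of the square root of the sample size, so that it is about halved when samples of four cases replace single cases, and is small, though not zero, at 30 cases.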


STANDARD ERROR OF THE MEAN 


By methods of mathematical statistics it has been found that the standard
deviation of an array of means derived under either of these conditions is

$$\sigma_M = \frac{\sigma_x}{\sqrt{N}} \tag{13.3}$$

in which $\sigma_M$ is read as the “standard error of the mean,” $\sigma_x$ is the parameter
standard deviation, and N is the number of cases in each sample. The
formula appears appropriate. When N = 1, $\sigma_M = \sigma_x$, and as N increases,
$\sigma_M$ decreases. It is also reasonable that $\sigma_M$ should vary directly with $\sigma_x$.

Since $\sigma_x$ is an unknown parameter, it is conventionally estimated by
Formula 13.2. Accordingly, the standard error of the mean, designated as
$s_M$ and estimated from the sample standard deviation, is

$$s_M = \frac{\sqrt{\dfrac{\sum x^2}{N - 1}}}{\sqrt{N}} = \frac{s_x}{\sqrt{N}} \tag{13.3a}$$

It is to be noted that $\sigma_M$ gives the variability of means around the
parameter μ, not around the sample mean M. Consequently, we cannot
say, since 95 percent of a normal distribution lies between the limits of
±1.96σ, that there are 95 chances in 100 that the true mean lies within
±1.96$s_M$ of an observed mean. In general, however, the smaller the $s_M$,
the closer the observed mean will be to the parameter. In a later section a
method involving M, $s_M$, and t will be given for establishing fiducial, or
confidence, limits within which μ is very likely to be found.
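
The estimated standard error of the mean in Formula 13.3a can be computed in a few lines. The sketch below is a minimal illustration; the function name and the eight scores are hypothetical, and the standard deviation is estimated with (N − 1) as the divisor, as in Formula 13.2.

```python
import math


def standard_error_of_mean(scores):
    """Formula 13.3a: the (N - 1) standard deviation divided by the square root of N."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two cases are needed to estimate the standard deviation")
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return s / math.sqrt(n)


if __name__ == "__main__":
    scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, N = 8
    print(round(standard_error_of_mean(scores), 4))  # 0.7559
```

Quadrupling the number of cases, with the same standard deviation, would halve the standard error, since N enters the formula through its square root.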


OTHER EXAMPLES OF STANDARD ERRORS 

Standard errors have been discovered for most of the descriptive statistics 
in common use. Just as a standard deviation can be meaningfully computed 
for a variable that is not normally distributed, so the existence of a formula 
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form of the distribution is not necessarily a fixed characteristic of a statistic, 
since it may vary with the distribution of the variable or variables in the 
parent population, with the magnitude of the parameter value, with N, 
and with v (the number of the degrees of freedom). The central limit 
theorem is often crucial in this connection, since as numbers of cases or 
numbers of summed variables increase, there is a tendency for a nonnormal 
sampling distribution to become normal. 


CONCEPT OF THE STANDARD ERROR 


The distributions in common use are always based on mathematical func- 
tions that are chosen either because of theoretical considerations or be- 
cause the function has repeatedly been found to give a good fit to observed 
data, or for both reasons. With the exception of the binomial (when n is 
small) and the Poisson, direct numerical methods for finding the relative 
frequency at various values or for a band of values of a theoretical distribu- 
tion tend to be too complex for routine usage. This is a chief reason for the 
extensive use of tables in statistical analysis. Thus, when it is known that a 
Statistic is distributed normally, tables of the normal curve can be used in 


Now consider the distr. 


the Same population, Bec 
With a case п 
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the number of values in each sample became larger, eventually becomin 
more or less normal. What is true for skewed populations is true also bx 
i. ce departing in other Ways from the normal. The point at which a 
жй. becomes normal depends, of course, on the criterion of 
E ee on how the variable is distributed in the parent population, and 
"wis e number of cases in each sample). It seems reasonable to regard 
an as distributed normally when either of the following conditions 
or both, apply: " 


а The variable is distributed normally in the population; or 
- The number of cases in the sample is large. 


S 
TANDARD ERROR OF THE MEAN 


B А ИРЕЕТ 
e D of mathematical statistics it has been found that the standard 
lation of an array of means derived under either of these conditions is 


E (13.3) 
VN 

ndard error of the mean,” с, is the parameter 
ber of cases in each sample. The 
= l, см = су, and as N increases, 
м should vary directly with oy. 

s conventionally estimated by 
f the mean, designated as 


Cu = 


е см is read as the “sta 
eae deviation, and N is the num 
еа es appears appropriate. When N 
mage It is also reasonable that ом: 
нчи 6, is an unknown parameter, it i 
runs à 13.2. Accordingly, the standard error of the m 
estimated from the sample standard deviation, is 


= 
N- 5 

= —— ———— (13.3a) 
SM JN JN Ld 


It is to be noted that $\sigma_M$ gives the variability of means around the
parameter $\mu$, not around the sample mean M. Consequently, we cannot
say, merely because 95 percent of a normal distribution lies between the
limits of $\pm 1.96\sigma$, that there are 95 chances in 100 that the true mean
lies within $\pm 1.96 s_M$ of an observed mean. In general, however, the
smaller the $s_M$, the closer the observed mean will be to the parameter. In
a later section a method involving M, $s_M$, and t will be given for
establishing fiducial, or confidence, limits within which $\mu$ is very likely
to be found.


OTHER EXAMPLES OF STANDARD ERRORS

Standard errors have been discovered for most of the descriptive statistics
in common use. Just as a standard deviation can be meaningfully computed
for a variable that is not normally distributed, so the existence of a formula
for a standard error does not imply that the statistic itself is normally
distributed. For the standard error to be useful in making inferences, 
knowledge of the shape of the distribution is essential. A nonzero corre- 
lation coefficient (which is not normally distributed) can be converted to a 
function that is more or less normally distributed and which has a known 
standard error, so that a table of the normal curve can be used in evaluating 
the observed statistic.

In reading reports of research, standard errors can always be inter- 
preted as the expected standard deviation of the statistic if it were 
computed from a series of samples of the same size drawn from a popu- 
lation. Generally, this population is considered to be unlimited. If, however, 
the population is finite and consists only of N’ cases, then the standard 
error is somewhat less than when the population is infinite. The correction 
factor, by which the usual formula is to be multiplied, is $\sqrt{(N' - N)/(N' - 1)}$,
in which N' is the number of cases in the population and N is the number
of cases in the sample. When N’ is large compared with N, the correction 
is negligible. 
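
The estimate of Formula 13.3a and the finite-population correction can be sketched in a few lines of Python. This is only an illustration: the function name, the argument names, and the five scores are hypothetical and do not come from the text.

```python
import math


def se_mean(values, population_size=None):
    """Estimated standard error of the mean (Formula 13.3a).

    If population_size (N') is given, the result is multiplied by the
    finite-population correction sqrt((N' - N) / (N' - 1)).
    """
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations from the sample mean
    sum_sq_dev = sum((v - mean) ** 2 for v in values)
    # Estimated population standard deviation
    est_sd = math.sqrt(sum_sq_dev / (n - 1))
    se = est_sd / math.sqrt(n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


scores = [4, 7, 6, 5, 8]  # hypothetical sample
print(se_mean(scores))                      # unlimited population
print(se_mean(scores, population_size=20))  # finite population of 20 cases
```

With a very large N' the two results agree closely, which is the sense in which the correction becomes negligible.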

Descriptions of various standard errors are available in advanced texts. 
Here, only a few need be mentioned. 

The standard error of the median is somewhat larger than that of the 
mean. Hence, it can be said that the mean is more “reliable” than the 
median. If the sample is large, the median is distributed approximately
normally, and if the sample is from a normal population, the standard
error is

    $\sigma_{mdn} = 1.253\,\dfrac{\sigma_x}{\sqrt{N}}$    (13.4)

In estimating $\sigma_{mdn}$, the parameter $\sigma_x$ is replaced in practice with the
estimated population value of the standard deviation, $\sqrt{\sum x^2/(N - 1)}$.

The standard error of a proportion is

    $\sigma_p = \sqrt{\dfrac{\tilde{p}\,\tilde{q}}{N}}$    (13.5)

in which, since the parameters $\tilde{p}$ and $(1 - \tilde{p})$ or $\tilde{q}$ are unknown,
the observed proportion p and its complement q may be used as estimates
of $\tilde{p}$ and $\tilde{q}$.


STANDARD ERROR OF r

The standard error of a correlation coefficient is given by

    $\sigma_r = \dfrac{1 - \tilde{r}^2}{\sqrt{N - 1}}$    (13.6)

in which $\tilde{r}$ is the parameter and N is the number of cases measured on the
two variables. As $\tilde{r}$ departs from .00, the distribution of sample r's becomes
more and more skewed, so that the normal curve becomes less and less applicable.
Also, it is never appropriate to substitute the obtained r for $\tilde{r}$, the unknown
parameter.

A special case of Formula 13.6, however, is often very useful. For an $\tilde{r}$ of
.00, the formula becomes

    $\sigma_{r_0} = \dfrac{1}{\sqrt{N - 1}}$    (13.7)


When N is large, r’s computed from samples drawn from a bivariate 
normal population, in which $\tilde{r}$ is zero, tend to follow a normal distribution.
Accordingly, multiplication of an obtained correlation by $\sqrt{N - 1}$ yields
the number of standard error units that it falls above or below an assumed
parameter value of .00. The result can be readily evaluated from a table
of the normal curve, such as Table A (Appendix). 

As a numerical example, consider an r of .25 computed from 65 cases.
The value of $\sqrt{N - 1}$ is 8 and $1/\sqrt{N - 1}$ is .125. When .25 is divided by
.125, the result is 2.00, showing that the obtained r is 2.00 standard errors
above a hypothetical parameter value of .00. (Since division by a reciprocal,
$1/\sqrt{N - 1}$, is numerically the same as multiplying by the number
$\sqrt{N - 1}$, an identical result is found as $r\sqrt{N - 1}$, or .25 × 8 = 2.00.)

From Table A it is seen that the proportion of cases in a normal distribution
between the mean and $2.00\sigma$ is .477. Accordingly, the proportion of
cases above $+2.00\sigma$ is .023. Hence, it can be said that there are 2.3 chances
in 100 that an r of .25 or more could be found for a random sample of 65
drawn from a population in which $\tilde{r}$ = .00.
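
The arithmetic of this example can be checked with a short Python sketch. The function name is hypothetical, and the normal-curve area is computed with the standard library's `statistics.NormalDist` rather than read from Table A.

```python
import math
from statistics import NormalDist


def standard_errors_from_zero(r, n):
    """Number of standard errors (Formula 13.7) that r lies from a parameter of .00."""
    return r * math.sqrt(n - 1)


z = standard_errors_from_zero(0.25, 65)
# Proportion of a normal distribution lying above z (one-sided)
p_above = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_above, 3))
```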


LEVELS OF SIGNIFICANCE OF r

A method of evaluating many obtained statistics, including correlations, is
in reference to a level of significance, which is chosen arbitrarily, but which
is helpful in determining whether or not a statistic computed from a
sample should be considered a chance deviation from a posited parameter
value. Since exceedingly improbable events can occur fortuitously, even a
perfect correlation might conceivably be found in a sample, say, of 65
cases drawn at random from a population in which $\tilde{r}$ = .00.

The practice is to determine a point on the theoretical distribution of the
statistic, beyond which is found a certain percentage of the distribution,
usually 5 or 1 percent. This is then taken as the dividing point between
statistics that are “significant” at that level and those that are not.

When a statistic is known to be normally distributed, any desired
significance level can be established from a knowledge of the standard
error and the use of a table of the area of the normal curve. Tests for
normally distributed statistics are illustrated graphically in Fig. 13.1. 


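[Figure 13.1: four normal curves with shaded rejection regions. Panels:
$\alpha = .01$, one-sided test (beyond $+2.326\sigma$); $\alpha = .01$, two-sided test
(beyond $-2.575\sigma$ and $+2.575\sigma$); $\alpha = .05$, one-sided test (beyond
$+1.645\sigma$); $\alpha = .05$, two-sided test (beyond $-1.960\sigma$ and $+1.960\sigma$).]

FIG. 13.1. STATISTICAL TESTS FOR NORMALLY DISTRIBUTED STATISTICS.
($\alpha = .01$ AND $\alpha = .05$ FOR ONE-SIDED AND TWO-SIDED TESTS.)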


When the observed statistic, in terms of standard errors, falls in the
shaded area, the null hypothesis may be rejected at the stated level of
confidence.


VALUES OF z SCORES FOR ONE-SIDED AND TWO-SIDED TESTS

                        P VALUE
z SCORE    ONE-SIDED TEST    TWO-SIDED TEST
 2.575          .005              .010
 2.326          .010              .020
 1.960          .025              .050
 1.645          .050              .100
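
The P values in this table can be recomputed from the normal curve. The following Python sketch assumes only the standard library; the function name is hypothetical.

```python
from statistics import NormalDist


def tail_probabilities(z):
    """Return (one-sided P, two-sided P) for a z score under the normal curve."""
    upper = 1 - NormalDist().cdf(z)
    return upper, 2 * upper


for z in (2.575, 2.326, 1.960, 1.645):
    one_sided, two_sided = tail_probabilities(z)
    print(f"{z:.3f}   {one_sided:.3f}   {two_sided:.3f}")
```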


In the particular instance, the standard error is .125 and the theoretical 
distribution is known to be approximately normal. From Table A it is 
seen that 49 percent of a normal distribution lies between the mean and 
$+2.326\sigma$. The value of 2.326 standard errors is 2.326 × .125, or .291, which
can be taken as the division point between coefficients, based on a sample of
65 cases, that are “significant at the 1 percent level” and correlations
that are not. Since the obtained coefficient in the example is .25, it is not
significant at the 1 percent level. It is, however, significant at the 5 percent
level. Since 45 percent of a normal distribution lies between the mean and
$+1.645\sigma$, it follows that 5 percent of the cases in a chance distribution
fall 1.645 standard errors or more above the mean. With a standard error of
.125, and 1.645 × .125 = .206, it would be expected that 5 percent
of the r's would fall above .206. Hence, the obtained r can be considered to
be significant at the 5 percent level.
In this example the hypothesis is to the effect that there is no association
between the two variables in the population. This has been considered
disproved only if the r in the sample is high and in a stated direction (in this
case, positive). In psychological research, this is the usual way in which a
hypothesis concerning a correlation is investigated, since there is generally
some anticipation of the direction of association.

If the logic of the situation calls for disproof of the hypothesis, whether
the obtained r is positive or negative (provided, of course, its absolute
value is sufficiently high), then the procedure involves a “two-tailed”
rather than the “one-tailed” test demonstrated above.

Reference to Table A again shows that 49.5 percent of a normal distribution
lies between the mean and $2.575\sigma$. Accordingly, 1 percent of the
distribution lies outside the limits $\pm 2.575\sigma$. For the hypothesis of no
association to be disproved at the 1 percent level of significance under a
two-tailed test, an observed r (based on a sufficient number of cases so
that the assumption of normal distribution of r for an $\tilde{r}$ of .00 is plausible)
must lie outside the limits of +2.575 standard errors and -2.575 standard
errors. From Table A it can also be seen that 2.5 percent of a normal
distribution is above $+1.96\sigma$ (and a similar percentage below $-1.96\sigma$).
These, then, are the limits for the 5 percent level on the “two-tailed” basis.
If r is more than 1.96 standard errors above or below .00, the hypothesis of
no association in either direction is disproved at the 5 percent level of
significance.

When N is not large (customarily considered as less than 30), the use of
a table of the normal curve for evaluating whether or not an obtained r is
significantly different from zero involves appreciable error. The reason is
that the distribution of sample r's (when $\tilde{r}$ = .00) approaches normality
only as N becomes large and is perfectly normal only when N is infinite.
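
The dividing points discussed above can be obtained for any N with a small Python sketch. It assumes the large-sample normal approximation of Formula 13.7; the function name and its arguments are hypothetical.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha, two_sided=False):
    """Smallest |r| significant at level alpha under the normal approximation.

    Assumes a parameter correlation of .00 and N large enough for the
    standard error 1 / sqrt(N - 1) to apply.
    """
    tail = alpha / 2 if two_sided else alpha
    z_crit = NormalDist().inv_cdf(1 - tail)
    return z_crit / math.sqrt(n - 1)


print(round(critical_r(65, 0.01), 3))                  # one-tailed, 1 percent
print(round(critical_r(65, 0.05), 3))                  # one-tailed, 5 percent
print(round(critical_r(65, 0.05, two_sided=True), 3))  # two-tailed, 5 percent
```

For N = 65 the one-tailed values agree with the .291 and .206 found from Table A.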


THE t DISTRIBUTION APPLIED TO r

A function of r and N follows the t distribution with (N - 2) degrees
of freedom precisely, provided N is 3 or more. For r based on small samples,
the use of this function is essential, and it is convenient when N is 30
or more. This is an application of the t distribution described in Chapter
12. The function is

    $t = \dfrac{r\sqrt{N - 2}}{\sqrt{1 - r^2}}$    (13.8)




To evaluate an obtained r, t may be computed by Formula 13.8 and its
significance found from any table of t, such as Table T (Appendix). More
conveniently, however, a table of the significant points of the function,
such as Table R, is entered directly with the degrees of freedom, which for
zero-order r are (N - 2).

For a partial r, the number of degrees of freedom is decreased from
(N - 2) by 1 for each variable partialled out. For a multiple R, (N - 2) is
reduced by 1 for each variable beyond the first used as a predictor to find
$\nu$, the degrees of freedom. When Table R is used, only the line entered is
changed; if t is computed, Formula 13.8 becomes

    $t = \dfrac{r\sqrt{\nu}}{\sqrt{1 - r^2}}$    (13.8a)

in which $\nu$ is the number of degrees of freedom.
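
Formula 13.8a and the rules for degrees of freedom can be sketched in Python as follows. The function names are hypothetical, and the resulting t must still be referred to a table of t such as Table T.

```python
import math


def t_from_r(r, df):
    """t corresponding to a correlation r with df degrees of freedom (Formula 13.8a)."""
    return r * math.sqrt(df) / math.sqrt(1 - r * r)


def df_zero_order(n):
    """Degrees of freedom for a zero-order r."""
    return n - 2


def df_partial(n, n_partialled_out):
    """(N - 2) decreased by 1 for each variable partialled out."""
    return n - 2 - n_partialled_out


def df_multiple(n, n_predictors):
    """(N - 2) decreased by 1 for each predictor beyond the first."""
    return n - 2 - (n_predictors - 1)


print(round(t_from_r(0.56, df_zero_order(20)), 2))
```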
Correlations significant at the 1 and 5 percent levels for various degrees
of freedom are graphed in Fig. 13.2. This graph is usable for one-sided
tests in which the correlation is to be tested for the presence of association
in one direction only. The information is exactly the same as in two columns
of Table R. For routine usage the table is probably to be preferred, since
values close to the significance levels are easier to evaluate.

[Figure 13.2: vertical axis, DEGREES OF FREEDOM (ticks from 10 to 100);
horizontal axis, VALUES OF r (ticks from .30 to 1.00). One region is labeled
“Area in which r is significantly different from .00 (1 percent level or
better)”; the other is labeled “Area in which r is of doubtful significance or
not significantly different from .00.”]

FIG. 13.2. CORRELATIONS SIGNIFICANT AT THE 1 PERCENT AND 5 PERCENT
LEVELS FOR VARIOUS S…

Using Table T and Fig. 13.2, the r of .25 based on 65 cases, which was
previously tested by the formula for the standard error of an $\tilde{r}$ of zero,
may be readily retested for significance. Using 63 df,

    $t = \dfrac{r\sqrt{\nu}}{\sqrt{1 - r^2}} = \dfrac{.25\sqrt{63}}{\sqrt{1 - (.25)^2}} = \dfrac{1.98}{.968} = 2.05$

FISHER'S $z_r$ TRANSFORMATION OF r

As noted above, r may be regarded as normally distributed when N is
large and $\tilde{r}$ = .00; when $\tilde{r}$ = .00 and N is not specified, a function of r
and $\nu$ follows the distribution of t corresponding to $\nu$. Appropriate procedures
have been described for testing whether or not an observed r can be
considered a chance deviation from a parameter $\tilde{r}$ of zero at a fixed level
of significance. These procedures are not useful, however, for testing
whether an observed r differs significantly from a posited nonzero
parameter value, nor for testing the significance of the differences between two
observed r's.

The difficulty is that only when $\tilde{r}$ = .00 are sample r's distributed
symmetrically, and only when $\tilde{r}$ = .00 and N is large are they distributed
normally. The higher the absolute value of $\tilde{r}$, the more skewed the
distribution of sample r's becomes. To remedy this situation, Fisher (1) developed
a transformation of r, $z_r$, which, for all practical purposes, is normally
distributed. While the value of $z_r$ varies, of course, directly with r, the shape
of the distribution of $z_r$ is independent of the magnitude of r. The formula
is

    $z_r = \dfrac{1}{2}\log_e \dfrac{1 + r}{1 - r} = 1.1513 \log_{10} \dfrac{1 + r}{1 - r}$    (13.9)

Values of r and $z_r$ are presented in Table Z (Appendix), which permits
conversion from r to $z_r$ and from $z_r$ to r more readily than does the use of
Formula 13.9. As will be noted from Table Z, r and $z_r$ up to .25 do not
vary more than .005. When r = .50, $z_r$ is .55, and thereafter the divergence
continues to become progressively greater. For negative values of r, the
sign of $z_r$ also becomes negative.
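
Where Table Z is not at hand, Formula 13.9 and its inverse can be computed directly. The sketch below uses hypothetical function names; the inverse relies on the fact that Formula 13.9 is the inverse hyperbolic tangent of r.

```python
import math


def r_to_z(r):
    """Fisher's z_r for a correlation r (Formula 13.9, natural-log form)."""
    return 0.5 * math.log((1 + r) / (1 - r))


def z_to_r(z):
    """Convert z_r back to r; the inverse of Formula 13.9 is tanh."""
    return math.tanh(z)


for r in (0.25, 0.50, 0.75, -0.50):
    z = r_to_z(r)
    print(f"r = {r:+.2f}   z_r = {z:+.3f}   back to r = {z_to_r(z):+.2f}")
```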
STANDARD ERROR OF $z_r$

The standard error of $z_r$ is

    $s_{z_r} = \dfrac{1}{\sqrt{N - 3}}$    (13.10)

It is to be noted that there is no advantage in converting an r to $z_r$ in
order to test whether its divergence from a posited $\tilde{r}$ of .00 can be attributed
to chance fluctuation at a given level of confidence. The t test is always
applicable, and it is often permissible to divide the obtained r by the
standard error of a zero r ($s_{r_0}$) to find the number of standard errors the
observed correlation is above or below the posited parameter value of .00.

The previous example of an r of .25 based on 65 cases can, however, be
readily tested through the use of $z_r$. The $z_r$ value is .255 and $s_{z_r}$ is 1/7.874,
both slightly greater than the corresponding r and $s_{r_0}$. Multiplication of
.255 by 7.874 (numerically equivalent to dividing .255 by 1/7.874) yields
2.01 standard errors, which, by a one-sided test, is significant at the 5
percent level but not at the 1 percent level.

A more exacting comparison of the $z_r$ technique is with a large r based
on only a few cases and evaluated through the use of both t and $z_r$.
From Table R it may be noted that an r of .75 with 7 df (N = 9) is
significantly different from .00 at the .01 level of confidence. Table Z shows that
the $z_r$ value corresponding to an r of .75 is .973, while by Formula 13.10,
$s_{z_r}$ is $1/\sqrt{9 - 3} = 1/2.4495$. Multiplication of .973 by 2.4495 yields the
fact that the obtained $z_r$ is 2.38 standard errors above zero. Table A
indicates that the proportion of cases in a normal distribution above
$2.38\sigma$ is .0087, which would show that the r of .75 is significant at the
1 percent level. (There is a slight discrepancy between the two techniques,
since in the normal curve, the dividing point between the lower 99 percent
and the upper 1 percent is approximately 2.33. However, the agreement
should be considered excellent.) Other illustrations of testing the
significance of a correlation are given in Example 13.1.
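
The comparison for an r of .75 with N = 9 can be reproduced with the following Python sketch. The function name is hypothetical; the normal-curve area comes from `statistics.NormalDist`, and the t value would still be referred to Table T or Table R, since the standard library has no t distribution.

```python
import math
from statistics import NormalDist


def z_r_test(r, n):
    """Return (z_r, standard errors above zero, one-sided P) by Formulas 13.9 and 13.10."""
    z_r = math.atanh(r)            # same value as Formula 13.9
    se = 1 / math.sqrt(n - 3)      # Formula 13.10
    ratio = z_r / se
    p_above = 1 - NormalDist().cdf(ratio)
    return z_r, ratio, p_above


z_r, ratio, p_above = z_r_test(0.75, 9)
# The t of Formula 13.8 for the same data, with N - 2 = 7 df
t = 0.75 * math.sqrt(9 - 2) / math.sqrt(1 - 0.75 ** 2)
print(round(z_r, 3), round(ratio, 2), round(p_above, 4), round(t, 2))
```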


EXAMPLE 13.1

TESTING THE SIGNIFICANCE OF AN OBTAINED r

In Table 6.3 a correlation of .56 is reported for a sample of 20 cases. If the
parameter value were .00, what would be the expectation of a positive r of .56
or more, merely by sampling?

This can be evaluated by Formula 13.8a and Table T as follows:

    $t = \dfrac{r\sqrt{\nu}}{\sqrt{1 - r^2}} = \dfrac{.56\sqrt{18}}{\sqrt{1 - (.56)^2}} = \dfrac{2.38}{.83} = 2.87$

If Table T is entered with 18 df, it is seen that the expected proportion of t's
occurring by chance beyond the limits of ±2.552 is .02. A one-sided test is
appropriate because in analyzing the relationship between variables, the
investigator typically has information as to its direction. Accordingly, since .01 of
chance t's are greater than +2.552, the t of 2.87 indicates that the r is significantly
different from .00 at better than the .01 level.

The same conclusion can be readily reached by inspection of Fig. 13.2.

Another method of testing whether the divergence from .00 is greater than
can reasonably be attributed to chance is through the $z_r$ transformation. By
Table Z, the obtained r of .56 is converted to a $z_r$ of .633. The standard error
of $z_r$ is independent of its size, and by Formula 13.10, is $1/\sqrt{N - 3}$ or .24. The
obtained $z_r$ is .63/.24, or 2.63, standard errors above a $z_r$ of .00. From Table A
it is seen that in a normal distribution, the proportion of cases between the mean
and $2.63\sigma$ is .4957. Since .5000 of the cases are above the mean, the expected
proportion of cases above $+2.63\sigma$ is .0043, and again r can be taken as significantly
different from .00 at better than the .01 level.

When N is large (say, 50 or more), an r may be tested for divergence from zero
by Formula 13.7. Consider an r of .20 based upon N = 50. Here $s_{r_0} = 1/\sqrt{49}$,
and $r/s_{r_0}$ = .20 ÷ 1/7 = .20 × 7 = 1.40. Reference to Table A shows that in a
normal distribution, the proportion of cases above $1.40\sigma$ is (.5000 - .4192), or
.0808. Accordingly, r is not high enough to reach the .05 level of significance.
(To reach the .05 level in a one-direction test, the r would have to be $1.645 s_{r_0}$;
and for the .01 level, $2.326 s_{r_0}$.)


APPLICATIONS OF THE $z_r$ TRANSFORMATION

While the $z_r$ transformation is not needed for testing the departure of an
observed r from an $\tilde{r}$ of .00, it has three important uses:

1. In setting limits, on the basis of an observed r, within which $\tilde{r}$ is likely
   to be found;
2. In testing differences between r’s; and
3. In averaging r’s from different samples so as to obtain a single value
   representative of several coefficients.

The first two applications will be discussed in later sections.


When considered as covariances of z scores (standard scores with mean
of zero and variance of unity), r's may be added to find a covariance
involving a sum variable. However, r's as such cannot be added or averaged.

When two or more r's involving the same variables have been computed
from different samples, the most rigorous technique for finding the
average of these r’s involves the following steps:

1. Convert each r to the corresponding $z_r$ (Table Z).
2. Correct each $z_r$ by subtracting a correction term, r/[2(N - 1)]. Strictly
   speaking, this should be $\tilde{r}$/[2(N - 1)], but $\tilde{r}$ is unknown.
3. Multiply each corrected $z_r$ by the number of its degrees of freedom,
   (N - 3).
4. Obtain the sum of these results.
5. Divide this sum by the sum of all the degrees of freedom. This results
   in a mean corrected $z_r$ value.
6. By the use of Table Z, convert this $z_r$ to an r, which may then be
   considered the mean r over all the samples.
The use of corrected $z_r$ is important when N’s are small, and weighting
by df is important when N’s vary widely from sample to sample.

If N’s are large and do not vary greatly from sample to sample, a
simple mean of the $z_r$’s is ordinarily sufficient.

Somewhat more approximate, but reasonably accurate, is the procedure
of taking the quadratic mean of the correlations as the representative r.
Thus

    $M_r = \sqrt{\dfrac{\sum r^2}{n}}$    (13.11)

in which n is the number of r’s. If all r’s are low (say, .30 or less) little
violence will be done if they are averaged directly, since in that case all
r’s are almost identical with corresponding $z_r$’s.
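
The six-step averaging procedure and the quadratic mean of Formula 13.11 can be sketched in Python. The function names and the sample values are hypothetical, and the conversions that the text performs with Table Z are done here with `math.atanh` and `math.tanh`.

```python
import math


def mean_r(samples):
    """Average r's from several samples; samples is a list of (r, N) pairs."""
    weighted_sum = 0.0
    total_df = 0
    for r, n in samples:
        z = math.atanh(r)                       # step 1: r to z_r
        z_corrected = z - r / (2 * (n - 1))     # step 2: subtract r / [2(N - 1)]
        df = n - 3
        weighted_sum += df * z_corrected        # steps 3 and 4: weight by df and sum
        total_df += df
    mean_z = weighted_sum / total_df            # step 5: mean corrected z_r
    return math.tanh(mean_z)                    # step 6: z_r back to r


def quadratic_mean_r(rs):
    """Approximate representative r by Formula 13.11 (the sign of each r is lost)."""
    return math.sqrt(sum(r * r for r in rs) / len(rs))


data = [(0.40, 30), (0.60, 80)]  # hypothetical (r, N) pairs
print(round(mean_r(data), 3))
print(round(quadratic_mean_r([r for r, _ in data]), 3))
```

Because the weights are degrees of freedom, the larger sample pulls the mean toward its own r.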


THE NULL HYPOTHESIS AND TESTS OF SIGNIFICANCE 


The testing through the use of $s_{r_0}$ or t or $s_{z_r}$ of whether an obtained
coefficient of correlation varies from a posited parameter value of zero more
than could be expected by chance is a concrete example of the use of what
is called, in Fisher’s terminology, a null hypothesis.

A null hypothesis need not be simply that $\tilde{r}$ is zero, nor that there is a
difference of zero between two parameters (such as means), although
frequently it is either of these. A null hypothesis is far more general. It is
a statement about one or more parameters (or characteristics of the
population) cast in such form that probabilities of divergence from the
hypothesis in defined and randomly selected samples can be precisely
stated. This, of course, requires knowledge of the applicable distribution
function, which most generally is the normal distribution, or t, or F, but
which could be some other chance distribution, such as the binomial.

In all cases a null hypothesis assumes that in specific samples there are no
differences from the posited parameter value, except those differences
resulting purely from chance.

A null hypothesis can never be proved because, if no differences are
found in one or more experiments, the possibility will always remain of
discovering a difference in a subsequent study. However, a null hypothesis
can be disproved, not absolutely, but at a certain level of significance,
often denoted as $\alpha$, the Greek letter alpha, which is written as a proportion.
In the earlier discussion of the detection of association between two
variables, two levels of significance are used, the 5 percent level ($\alpha$ = .05)
and the 1 percent level ($\alpha$ = .01).


STEPS IN TESTING A HYPOTHESIS 


The use of the null hypothesis permits a logical sequence in the
development of inferences about a population parameter. The steps are:

1. Development of a hypothesis that if disproved would help to establish
   the research hypothesis;

2. Selection of an acceptable level of significance in the study;

3. Collection of measurements or other observations on a sample
   considered representative of the population;

4. Decision, based on the known sampling distribution of the statistic
   employed, as to whether the observations (summarized as a statistic)
   are in accordance with the null hypothesis or whether, at the level of
   significance already selected, it has been disproved;

5. If the null hypothesis can no longer be accepted, evidence has been
   found for an alternate view, which eventually may come to be regarded
   as established.


The null hypothesis, then, is an integral part of the design of the study.
It may or may not assume an experimental effect. It must, however, be
exact and specific, and it must provide a basis by which all possible
outcomes can be evaluated, usually by a table showing critical points of the
sampling distribution of the statistic. The choice of $\alpha$ (the level of
significance required for the hypothesis to be regarded as disproved) depends on
the risk involved in an erroneous decision.

The place of the observed statistic in the sampling distribution is
denoted as P, its probability. Figure 13.1 shows graphically the difference
between one-sided and two-sided tests. If a two-sided test is required by
the logic of the study, the function has to be cumulative so as to include
both ends of the chance distribution.

The probability (or P) that a given statistic can occur only by chance if
the null hypothesis is true is, of course, between .00 and 1.00. If it happens
that an observed statistic has a very small P, there are two alternatives,
either that the hypothesis is false or that a rare event has occurred.
In general, if there is little likelihood that the specified statistic could have
occurred by chance, the hypothesis is rejected. If P is large, the null
hypothesis is accepted, pending further evidence.


ALTERNATIVES IN STATISTICAL TESTS

Figure 13.3 shows diagrammatically the four possibilities in statistical
tests. The hypothesis is either correct or incorrect; the decision of the
investigator is to accept it or reject it.

In two of the four cells of the figure, the decision is correct.
The situation when the hypothesis is correct but is rejected is known as
Type I error, or “error of the first kind.” A Type I error is a calculated
risk because, for those cases in which the hypothesis is correct, its
probability varies directly with and is actually $\alpha$, the selected level of
significance. In the instances in which the hypothesis is correct, and in which
$\alpha$ is, say, .05, 5 percent of the statistics computed on comparable samples
would be expected, on the basis of chance alone, to meet the criterion for
rejection. Accordingly, 5 percent of the decisions for the instances in which
the null hypothesis is correct would be in error.


FIG. 13.3. ALTERNATIVES IN STATISTICAL TESTS

ALTERNATIVES                  ALTERNATIVES AS TO DECISION
AS TO HYPOTHESIS              REJECT                 ACCEPT
Hypothesis is correct         Type I Error           Decision is correct
Hypothesis is incorrect       Decision is correct    Type II Error


When the null hypothesis is in fact incorrect, the probability of making 
the correct decision (that is, to reject the hypothesis) is known as the power 
of the test. The acceptance of the null hypothesis when it is incorrect is 
known as Type II error, or “error of the second kind.”

In the conduct of research, major attention is often given to increasing
the power of the test applied to the statistics summarizing the observations.
If $\alpha$ is increased, power is increased; but this step also increases the
probability of Type I error. In general, the methods of reducing Type II error
are to use a more homogeneous population, to increase sample size, to decrease
random errors of all kinds, and to seek to increase the experimental effect.
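
How these factors bear on power can be illustrated with a Python sketch. It is a simplified case and not a procedure from the text: a one-sided test of a single mean against zero, with a known standard deviation and a normal sampling distribution. All names and numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def power_one_sided(effect, sd, n, alpha):
    """Power of a one-sided normal-curve test of a mean against zero.

    Assumes the population standard deviation sd is known and the true
    mean equals effect; power is the probability of rejecting the null
    hypothesis at level alpha.
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha)
    # True mean expressed in standard errors of the mean
    shift = effect / (sd / math.sqrt(n))
    return 1 - normal.cdf(z_crit - shift)


base = power_one_sided(effect=2.0, sd=10.0, n=50, alpha=0.05)
print(round(base, 3))
print(round(power_one_sided(2.0, 10.0, 50, 0.10), 3))   # larger alpha
print(round(power_one_sided(2.0, 10.0, 200, 0.05), 3))  # larger sample
print(round(power_one_sided(2.0, 5.0, 50, 0.05), 3))    # more homogeneous population
print(round(power_one_sided(4.0, 10.0, 50, 0.05), 3))   # larger experimental effect
```

When the effect is zero the function returns $\alpha$ itself, which is the probability of Type I error.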


EVALUATION OF DIFFERENCES BETWEEN MEANS 


The goal of a social science research study is often the discovery of a differ- 
ence between the means of two groups measured on the same variable, or 
between the means of the same group on the same variable under different 
conditions. 

With hardly an exception, pairs of observed means do differ, some 
fractionally, some by a considerable number of points on the scale that is 
being used. The fact that two sample means are not identical is never, in 
itself, of scientific interest. The question is always whether the two under- 
lying parameters differ. In addition, there is often a question as to the 
size of the difference or of its practical importance. These questions would 
require analyses beyond those described here and might be answered in 
terms of a correlation or the proportion of a variance (both evaluated as 
to reliability) or in reference to a null hypothesis that a difference in the 
population was not more than a certain amount. 

Here the null hypothesis is that the observed difference is no greater than
could be expected by chance. The level of significance, $\alpha$, is selected with
reference to the consequences of erroneous decisions. Routinely,
precautions are taken to reduce errors, whether of observations or induced by
variation extraneous to the problem of concern. Methods of sampling, of
experimental control, and of measurement all need to be considered in the
design of any study. Here the discussion is limited to procedures for
evaluating the reliability of a difference, that is, for determining whether
at a stated level of significance ($\alpha$) the observed difference between two
sample means reflects a difference between two parameters. It is always
assumed that proper controls have been imposed on outside sources of
variance.


UNCORRELATED AND CORRELATED MEANS 


The statistical test of the reliability of a difference varies according to
whether the samples are small or large, and according to whether the
means are uncorrelated or correlated.

Two sample means cannot be “correlated” in the sense that a correlation
coefficient can be computed between them. The variables on which the
means are based may, however, be correlated if each value in one set is
coupled with a value in the other set. This, of course, requires that the N
for the two means be identical, either through the use of the same subjects
or by some pertinent linkage, such as father-son or mother-daughter.
Two sets of means representing variables correlated in the population
would of themselves have a correlation in that in successive pairs of samples
they would tend to vary together. This is the meaning of the concept of
correlated means, or of other correlated statistics such as variances or
correlation coefficients.


DIFFERENCES BETWEEN UNCORRELATED MEANS

Since M is normally distributed if the sample is drawn from a normal
population or if N is large, it is reasonable to believe (what is actually the
case) that the difference between two means $(M_1 - M_2)$ is normally
distributed around the parameter, $(\mu_1 - \mu_2)$. To evaluate such an obtained
difference, the sampling distribution of an unlimited series of differences
based upon samples drawn at random from two unlimited populations
must be known. The theoretical standard deviation of such a series of
differences is the standard error of the difference of the means, which may
be indicated as $s_{d_M}$ or $s_{M_1 - M_2}$. There are two forms of this standard
error: one when the means are uncorrelated and another when they are
correlated.

For uncorrelated means,

    $s_{M_1 - M_2} = \sqrt{s_{M_1}^2 + s_{M_2}^2}$    (13.12)

in which $s_M^2$ is the square of the standard error of the mean by Formula
13.3a and the two groups are designated as 1 and 2.

When the two means are drawn from normally distributed populations (or
based on large numbers of cases), the difference $(M_1 - M_2)$ is normally
distributed. If the population variances, $\sigma_1^2$ and $\sigma_2^2$, are known, then
$(M_1 - M_2)/s_{M_1 - M_2}$ is normally distributed with unit variance. On the
hypothesis that there is no difference between the two means, this ratio of
the difference divided by the standard error of the difference can be
evaluated directly from a table of the normal curve (Table A, Appendix)
either as a one-sided or a two-sided test, depending on the logic of the
situation.

The reason why $(M_1 - M_2)/s_{M_1 - M_2}$ has unit variance is, of course,
because $s_{M_1 - M_2}$ is the standard deviation of the distribution of the
differences between means. If the parameter difference is zero, then any
obtained difference, divided by this standard error, is in z score form.

An important fact is that when $s_{M_1 - M_2}$ is estimated from the variances
of small samples rather than found from population variances, the ratio
$(M_1 - M_2)/s_{M_1 - M_2}$ is no longer distributed normally but follows the t
distribution with an appropriate number of degrees of freedom. If the
population variances can be assumed to be equal, as is ordinarily plausible,
then the number of degrees of freedom can be taken as $(N_1 + N_2 - 2)$,
and $s_{M_1 - M_2}$ is computed as follows:

    $s_{M_1 - M_2} = \sqrt{\left(\dfrac{N_1 V_1 + N_2 V_2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 + N_2}{N_1 N_2}\right)}$    (13.12a)

in which $V_1$ and $V_2$ are the sample variances, $\sum x^2/N$.


As stated in Chapter 12, the t distribution becomes a closer and closer
approximation to the normal as the number of degrees of freedom
becomes large. Accordingly, with large samples (say, 30 or more in each
group) $s_{M_1 - M_2}$ can be estimated as

    $s_{M_1 - M_2} = \sqrt{s_{M_1}^2 + s_{M_2}^2} = \sqrt{\dfrac{N_1 \sum X_1^2 - (\sum X_1)^2}{N_1^2 (N_1 - 1)} + \dfrac{N_2 \sum X_2^2 - (\sum X_2)^2}{N_2^2 (N_2 - 1)}}$    (13.12b)

In this situation the difference between the means divided by $s_{M_1 - M_2}$
as found from Formula 13.12b can be taken as normally distributed, and
the hypothesis that the parameter difference is zero can be tested through
the use of the normal distribution.


The testing of differences between uncorrelated means is illustrated in 
Example 13.2. 


EXAMPLE 13.2 


TESTING THE SIGNIFICANCE BETWEEN TWO UNCORRELATED MEANS 

Definition. Two means are * 

involve cases that can in no w 
based on the same variable. 


uncorrelated" if the two sets of observations 
ay be matched. Ordinarily, the two means are 
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_ Small Sample Method. An obtained difference between two means (Mi— М) 
divided by Ss, -Ma as found by Formula 13.12а can be tested by the г distri- 
bution. When samples are small (30 > N), the use of г is considered essential. 

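An arithmetical example is given below:

$$N_1 = 8, \quad M_1 = 7.5, \quad V_1 = \sum x_1^2/N_1 = 7.0$$
$$N_2 = 10, \quad M_2 = 9.4, \quad V_2 = \sum x_2^2/N_2 = 6.5$$

$$t = \frac{M_1 - M_2}{s_{M_1-M_2}} = \frac{M_1 - M_2}{\sqrt{\left(\dfrac{N_1V_1 + N_2V_2}{N_1 + N_2 - 2}\right)\left(\dfrac{N_1 + N_2}{N_1N_2}\right)}}$$

Taking the difference as positive,

$$t = \frac{9.4 - 7.5}{\sqrt{\left(\dfrac{(8 \times 7.0) + (10 \times 6.5)}{8 + 10 - 2}\right)\left(\dfrac{8 + 10}{8 \times 10}\right)}} = 1.46$$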
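The arithmetic of the small-sample method can be checked with a short sketch. It is only an illustration of Formula 13.12a under the text's definition of V as the sum of squared deviations divided by N; the function name `pooled_t` is an assumption introduced here, not something from the text.

```python
import math

def pooled_t(n1, m1, v1, n2, m2, v2):
    """t for two uncorrelated means by Formula 13.12a.

    v1 and v2 are sample variances computed with divisor N,
    as in the text (V = sum of squared deviations / N).
    """
    pooled = (n1 * v1 + n2 * v2) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled * (n1 + n2) / (n1 * n2))
    df = n1 + n2 - 2
    # The difference is taken as M2 - M1 so that t is positive here.
    return (m2 - m1) / se_diff, df

t, df = pooled_t(8, 7.5, 7.0, 10, 9.4, 6.5)
print(round(t, 2), df)
```

With the figures of the example, the sketch reproduces the t of 1.46 and shows the degrees of freedom implied by the two sample sizes.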

Evaluating the Obtained t by Table T. In testing the significance of the difference 

between two means, a hypothesis as to their relative direction generally exists. 

If a conventional teaching method has been used in group 1 and a novel technique 

has been tried out in group 2, the introduction of the new procedure implies the 

expectation that it may be better than, rather than merely different from, the 
old. Hence the “one-tailed” test of the null hypothesis is appropriate. 

Table T (Appendix) includes instructions as to how it is to be entered for
“one-tailed” and “two-tailed” tests. In the example given above, $\nu = N_1 + N_2 - 2$;
that is, there are 16 df, for which the critical points from Table T are

    TEST          .05 LEVEL    .01 LEVEL

    One-tailed      1.75         2.58
    Two-tailed      2.12         2.92


Since $t = 1.46$, it clearly does not meet the requirement of significance at, say,
the .05 level (which, of course, would be established prior to the statistical
analysis).

“Significance” is greatly affected by the number of cases. When N’s are
large, a relatively small difference may serve to disprove the null hypothesis.
With large N’s the standard error of the difference of Formula 13.12 can
be written


$$s_{M_1-M_2} = \sqrt{s_{M_1}^2 + s_{M_2}^2} = \sqrt{\frac{V_1}{N_1 - 1} + \frac{V_2}{N_2 - 1}}$$


Consider the numerical example above, with $N_1 = 80$ and $N_2 = 100$, but with
means and variances remaining unchanged. The difference divided by the standard
error of the difference (a statistic sometimes called the critical ratio) may
be written as




$$\frac{M_2 - M_1}{s_{M_1-M_2}} = \frac{1.9}{\sqrt{\dfrac{7.0}{79} + \dfrac{6.5}{99}}} = \frac{1.9}{.393} = 4.83$$


Both Table A and Fig. 13.1 show that a value of 4.83σ is so far above a parameter
mean difference of zero that the null hypothesis is refuted at better than
the .01 level of confidence.
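The large-sample critical ratio can be sketched in the same way. The function name `critical_ratio` is an assumption made for illustration, and the normal-curve probability is taken from Python's standard library rather than from Table A.

```python
import math
from statistics import NormalDist

def critical_ratio(n1, m1, v1, n2, m2, v2):
    """Difference between means over its large-sample standard error.

    v1 and v2 are variances with divisor N, so each squared
    standard error of a mean is V / (N - 1).
    """
    se_diff = math.sqrt(v1 / (n1 - 1) + v2 / (n2 - 1))
    return (m2 - m1) / se_diff

cr = critical_ratio(80, 7.5, 7.0, 100, 9.4, 6.5)
# One-tailed probability of a ratio this large if the parameter difference is zero.
p_one_tailed = 1 - NormalDist().cdf(cr)
print(round(cr, 2), p_one_tailed < 0.01)
```

The unrounded ratio comes out slightly above the 4.83 of the text, which was obtained after rounding the standard error to .393.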


DIFFERENCES BETWEEN CORRELATED MEANS 


A frequent problem in educational and psychological research is the
evaluation of a series of differences of pairs of scores, using the hypothesis
of no difference between the means as the hypothesis to be tested. In this
case the means are correlated and Formulas 13.12, 13.12a, and 13.12b for
$s_{M_1-M_2}$ do not apply.

The simplest procedure is to find the difference (D) between the two
scores for each case, summing these differences with regard to algebraic
sign to find $\sum D$, and summing the squares of the differences to find $\sum D^2$.

Then, for small samples,


$$t = \frac{M_1 - M_2}{s_{M_1-M_2}} = \frac{\sum D\,\sqrt{N-1}}{\sqrt{N\sum D^2 - (\sum D)^2}} \tag{13.13}$$

in which N is the number of differences, which may be evaluated as $t$ with
$(N - 1)$ degrees of freedom.

When N is large, $t$ may be taken as $z$ and evaluated from a table of the
normal curve.

The standard error of the difference between correlated means may be 
obtained from a modification of Formula 13.12, which includes a corre- 
lation term for samples that are not independent: 


$$s_{M_1-M_2} = \sqrt{s_{M_1}^2 - 2r_{12}s_{M_1}s_{M_2} + s_{M_2}^2} \tag{13.14}$$

This formula is sometimes useful, as when pairs of scores are not
immediately available. The use of both formulas is illustrated in Example
13.3.


EXAMPLE 13.3 

TESTING THE DIFFERENCE BETWEEN CORRELATED MEANS

Sanford (7) reports the following recognition distances in centimeters for
lower case letters for two observers H and J. Differences, D, and squares of
differences, $D^2$, are also shown.

             OBSERVED                              OBSERVED
              VALUES     DIFFERENCES                VALUES     DIFFERENCES
    LETTER    H    J      D     D²       LETTER     H    J      D     D²
      a      13   12      1      1         n       15   14      1      1
      b      16   16      0      0         o       13   13      0      0
      c      13   11      2      4         p       16   16      0      0
      d      16   14      2      4         q       16   16      0      0
      e      12    8      4     16         r       15   15      0      0
      f      15   17     -2      4         s       13   12      1      1
      g      14   16     -2      4         t       14   14      0      0
      h      16   14      2      4         u       15   13      2      4
      i      13   13      0      0         v       16   17     -1      1
      j      16   17     -1      1         w       17   17      0      0
      k      16   14      2      4         x       16   15      1      1
      l      15   13      2      4         y       16   19     -3      9
      m      18   18      0      0         z       13   13      0      0


The two sets of values are matched by letter and are correlated. Accordingly,
tests for correlated means are appropriate. For the test by Formula 13.13, the
following constants are needed:

$$N = 26, \quad \sum D = 11, \quad \sum D^2 = 63$$

The algebraic sign of the differences is arbitrary. Interpretation of the result
would be unchanged if signs were reversed and $\sum D$ became $-11$. Taking $\sum D$
as positive,

$$t = \frac{\sum D\,\sqrt{N-1}}{\sqrt{N\sum D^2 - (\sum D)^2}} = \frac{55}{\sqrt{1517}} = 1.41$$

If $\sum D$ were taken as negative, $t$ would be $-1.41$. A two-tailed test is required,
since the appropriate null hypothesis is that there is no difference in either
direction in the two population means. Here the “population” is regarded as
one of letters. Let us establish in advance the 5 percent level as the arbitrary
level of significance that we shall accept as indicating a real difference in the
means of the two observers. To evaluate the obtained difference (in this case,
11/26), we need to estimate what proportion of $t$’s would fall outside the limits
of +1.41 and −1.41 in an unlimited series of samples, if there were no difference
between means. The expected proportion of such $t$’s is taken as the P value, or
probability.

The number of degrees of freedom is 1 less than N, the number of pairs of
observations, and in this case is 25. From Table T it is seen that for 25 df, a $t$
greater than 1.41 or less than −1.41 would be expected more than 10 percent
of the time; that is, P > .10. Accordingly, a difference between the means cannot
be considered to have been demonstrated.

Formula 13.14 yields a similar interpretation, but requires computations as
follows:

$$\sum X_H = 388, \quad \sum X_J = 377, \quad \sum X_HX_J = 5699$$
$$\sum X_H^2 = 5848, \quad \sum X_J^2 = 5613, \quad r = .79$$
By Formula 5.1,

$$M_H = \frac{388}{26} = 14.92$$
$$M_J = \frac{377}{26} = 14.50$$

By Formula 5.82,

$$s_H = \tfrac{1}{26}\sqrt{26 \times 5848 - (388)^2} = 1.49$$
$$s_J = \tfrac{1}{26}\sqrt{26 \times 5613 - (377)^2} = 2.37$$

To estimate the standard error of the mean, Formula 13.3 may be written as

$$s_M = \frac{s_x}{\sqrt{N-1}}$$

in which $s_x$ is the sample standard deviation rather than the parameter as
estimated by Formula 13.2. Accordingly, $s_{M_H} = .30$, and $s_{M_J} = .47$.

By Formula 13.14 the standard error of the difference between the two means is

$$s_{M_H-M_J} = \sqrt{s_{M_H}^2 - 2r_{HJ}s_{M_H}s_{M_J} + s_{M_J}^2}$$
$$= \sqrt{(.30)^2 - 2(.79)(.30)(.47) + (.47)^2} = .30$$


An obtained difference divided by its standard error is sometimes called a 
critical ratio. When df is less than 30, as in this case, it must be interpreted as 
a t. Hence, 


$$t = \frac{14.92 - 14.50}{.30} = \frac{.42}{.30} = 1.40$$
which is approximately the same as that obtained more simply by the direct 
method demonstrated above. In practice, the formula involving r is used when 
means, standard deviations, and the correlation are known, but when original 
observations are no longer available. 
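Both routes to t in Example 13.3 can be followed in a brief sketch. The scores are those tabled for observers H and J (letters a through z); the variable and function names are assumptions chosen for illustration. Carried out without intermediate rounding, the two formulas are algebraically equivalent, so the small discrepancy between 1.41 and 1.40 in the example reflects rounding only.

```python
import math

# Recognition distances for letters a..z, observers H and J (Example 13.3).
H = [13, 16, 13, 16, 12, 15, 14, 16, 13, 16, 16, 15, 18,
     15, 13, 16, 16, 15, 13, 14, 15, 16, 17, 16, 16, 13]
J = [12, 16, 11, 14, 8, 17, 16, 14, 13, 17, 14, 13, 18,
     14, 13, 16, 16, 15, 12, 14, 13, 17, 17, 15, 19, 13]
n = len(H)

# Direct-difference method (Formula 13.13).
D = [h - j for h, j in zip(H, J)]
sum_d = sum(D)
sum_d2 = sum(d * d for d in D)
t_direct = sum_d * math.sqrt(n - 1) / math.sqrt(n * sum_d2 - sum_d ** 2)

def sd(xs):
    """Descriptive standard deviation (divisor N) from raw-score sums."""
    return math.sqrt(n * sum(x * x for x in xs) - sum(xs) ** 2) / n

# Method using r (Formula 13.14).
s_h, s_j = sd(H), sd(J)
sum_hj = sum(h * j for h, j in zip(H, J))
r = (n * sum_hj - sum(H) * sum(J)) / (n * n * s_h * s_j)
se_h = s_h / math.sqrt(n - 1)
se_j = s_j / math.sqrt(n - 1)
se_diff = math.sqrt(se_h ** 2 - 2 * r * se_h * se_j + se_j ** 2)
t_from_r = (sum(H) / n - sum(J) / n) / se_diff

print(round(t_direct, 2), round(t_from_r, 2), round(r, 2))
```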


TESTING DIFFERENCES BETWEEN CORRELATIONS 


If the correlation between two variables is computed in two independent
samples, the question as to whether the obtained difference between the
r’s can be attributed to chance can be tested through the use of the $z_r$
transformation (Table Z) and the normal distribution. Since the difference
between any two normally distributed variables is itself distributed normally,
and since the distribution of $z_r$ is approximately normal with a
standard error of $\sqrt{1/(N - 3)}$, it is necessary only to:

1. Convert both the r’s to $z_r$.
2. Find the difference between the $z_r$’s.
3. Divide this difference by $s_{z_1-z_2}$.
4. Evaluate the ratio by means of a table of the normal curve. On the
condition that both samples have been randomly selected from the
same population, the ratio $(z_{r_1} - z_{r_2})/s_{z_1-z_2}$ has a mean of zero and unit
variance.

The formula for the standard error of the difference between two such
correlations may be written as

$$s_{z_1-z_2} = \sqrt{s_{z_1}^2 + s_{z_2}^2} = \sqrt{\frac{1}{N_1-3} + \frac{1}{N_2-3}} = \sqrt{\frac{N_1 + N_2 - 6}{(N_1-3)(N_2-3)}} \tag{13.15}$$

A very different situation is the testing of the difference between two
correlations obtained on the same sample, and with one variable in
common. In Formula 13.16, this common variable is denoted as variable 3.
By the procedure about to be described, it is possible to test whether, for
example, the obtained correlation of a vocabulary test with measured
scholastic achievement differs significantly from the obtained correlation
of a verbal analogies test with the same criterion.

Here the $t$ test applies, $t$ being found from

$$t = (r_{13} - r_{23})\sqrt{\frac{N-3}{2(1 - r_{12})(1 - R_{3.12}^2)}} \tag{13.16}$$

in which the correlations among the three variables are $r_{12}$, $r_{13}$, and $r_{23}$,
and where $R_{3.12}^2$ is the square of the multiple correlation between the
common variable and the weighted sum of the other two.

The $t$ found from Formula 13.16 is evaluated with a table of $t$ (such as
Table T), with $(N - 3)$ being the number of degrees of freedom.

The use of these procedures for evaluating differences between r’s is
illustrated in Example 13.4.


EXAMPLE 13.4


TESTING DIFFERENCES BETWEEN CORRELATIONS


The Difference Between the r in Two Samples. The following correlations
between the Stanford Binet and the Army Beta were reported for two samples
of World War I recruits (8). Conversions to $z_r$ by Table Z (Appendix) are shown
below:

    SAMPLE            N       r       z_r
    Camp Dix          93     .744    .9594
    Camp Jackson     102     .649    .7736
Since $z_r$ is regarded as normally distributed, a difference between $z_r$’s is
divided by the standard error of the difference (as given by Formula 13.15) and
the resultant ratio is evaluated in terms of the normal frequency distribution.
The null hypothesis would be that there is no difference in either direction
between parameters represented by observed r’s. Accordingly, a two-tailed
test of significance is appropriate. This requires a $D/s_{z_1-z_2}$ of 1.96 or more for
significance at the 5 percent level and of 2.58 or more for significance at the
1 percent level.


By Formula 13.15, 


$$s_{z_1-z_2} = \sqrt{\frac{N_1 + N_2 - 6}{(N_1 - 3)(N_2 - 3)}} = \sqrt{\frac{189}{90 \times 99}} = .146$$

Accordingly,

$$\frac{D}{s_{z_1-z_2}} = \frac{.9594 - .7736}{.146} = 1.27$$


which is less than the 1.96 required for significance at the 5 percent level. (It can
also be seen from Table A that 1.27σ falls within the middle 80 percent of a
normal distribution.) The conclusion is that an r of .744 based on 93 cases and
an r of .649 for 102 cases cannot be regarded as significantly different.
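The four steps for comparing r's from independent samples can be sketched as follows. `math.atanh` gives Fisher's $z_r$ directly, so Table Z is not needed; the function name is an assumption made for illustration.

```python
import math

def z_ratio_independent_r(r1, n1, r2, n2):
    """Ratio of the difference in Fisher z_r values to its standard error."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # step 1: r to z_r
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # Formula 13.15
    return (z1 - z2) / se                            # steps 2 and 3

ratio = z_ratio_independent_r(0.744, 93, 0.649, 102)
# Step 4: compare with the two-tailed 5 percent point of the normal curve.
print(round(ratio, 3), abs(ratio) >= 1.96)
```

The unrounded ratio is marginally larger than the 1.27 of the example, which used a standard error rounded to .146; the conclusion is unchanged.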

The Difference Between Two r in the Same Sample. As reported in the same
source (8), E. K. Strong secured ratings by superiors for the intelligence of 313
men in three national guard companies. Intercorrelations of three variables, the
oral directions and memory span subtests of Group Examination a (a forerunner
of the Army Alpha) and the ratings, are as follows (N = 313):


                            2.      3.

    1. Oral directions     .53     .47
    2. Memory span                 .34
    3. Rating


In advance of computations, let us decide that a difference significant at the
5 percent level will lead us to reject the null hypothesis of no real difference
between the r’s. Also, it is to be noted that a two-tailed test would be called for
in the absence of a hypothesis as to the direction of a difference between the r’s.
In Formula 13.16 for $t$, the quantity $(1 - R_{3.12}^2)$ is $V_{3.12}$, a second-order
partial variance that can be found by the method for finding multiple R demonstrated
in Chapter 7 as follows:

             1         2         3
    1     1.0000     .5300     .4700
    2      .5300    1.0000     .3400
    3      .4700     .3400    1.0000
                     .7191     .0909
                     .1264     .7791

$$V_{3.12} = .7676$$
Then,

$$t = (r_{13} - r_{23})\sqrt{\frac{N-3}{2(1 - r_{12})(1 - R_{3.12}^2)}} = (.47 - .34)\sqrt{\frac{310}{2 \times .47 \times .7676}} = 2.69$$

By Table T, a value of 2.69 for 310 df is significant at slightly better than the
1 percent level. Since a $t$ of 1.97 would have been sufficient to cause us to reject
the null hypothesis, a difference in the r’s may be considered to be demonstrated.

Of course, as with significance tests in general, a statistically significant dif- 
ference in r’s may or may not have important practical consequences. However, 
if either the oral directions test or memory span test is to be used as a predictor, 
the former is the better choice in this case. 
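The test for two r's in the same sample can be sketched as below. Rather than the tabular method of Chapter 7, the sketch obtains $1 - R_{3.12}^2$ from the determinant of the three-variable correlation matrix divided by $(1 - r_{12}^2)$, a standard identity; that shortcut and the function name are assumptions of this illustration, not the procedure of the text.

```python
import math

def t_dependent_r(r12, r13, r23, n):
    """t for the difference between r13 and r23 (Formula 13.16).

    Variable 3 is the variable common to both correlations.
    """
    det = 1 - r12 ** 2 - r13 ** 2 - r23 ** 2 + 2 * r12 * r13 * r23
    v_3_12 = det / (1 - r12 ** 2)          # 1 - R^2 of variable 3 on 1 and 2
    t = (r13 - r23) * math.sqrt((n - 3) / (2 * (1 - r12) * v_3_12))
    return t, v_3_12, n - 3

t, v_3_12, df = t_dependent_r(0.53, 0.47, 0.34, 313)
print(round(t, 2), round(v_3_12, 4), df)
```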


DEPARTURE OF A STATISTIC FROM A HYPOTHETICAL VALUE 


When used for research, which aims at the discovery of general principles
applicable to populations, obtained statistics are estimates of corresponding
parameters. As discussed earlier, some statistics, such as the mean, are
“unbiased” in the sense that they exhibit no systematic warping in either
direction, the descriptive statistic being the best possible approximation
of the parameter obtainable from the particular sample. Other statistics,
such as the variance, require a correction when used as estimates of population
values. In all cases, however, the statistic, as computed or as slightly
modified, is a single value that constitutes a “point estimate” of the
corresponding parameter.

When the distribution of a statistic is known, the probability that an
obtained statistic departs from any fixed value can be determined. Previous
discussions in this chapter have exhibited several cases in which the
fixed value was taken as zero. Thus, the reliability of a correlation coefficient
was discussed in terms of the likelihood of an observed r departing
from a parameter value of zero. Also, the reliability of the difference between two
means or between two r’s was discussed in terms of the probability of the
parameter value of the difference being greater than zero.

The likelihood of the departure of an observed statistic from any hypothetical
value can be tested, provided the sampling distribution of the
statistic around that value is known. The situation is simplest when the
statistic (or its transformation) divided by the appropriate standard error
is normally distributed.

Consider the case in which it is wished to discover whether an obtained
r of .80 (based on 67 cases) differs significantly at the 5 percent level from
a hypothetical parameter value of .70. Since Fisher’s $z_r$ transformation of r can be
regarded as normally distributed with standard error of $1/\sqrt{N - 3}$, $z_r$ equivalents
for both r and the hypothetical value are found from Table Z, and the difference is
tested by dividing by $s_{z_r}$. For an r of .80, $z_r = 1.099$, and for the hypothetical
value of .70, $z_r = .867$. Since a one-sided test is appropriate, it may be deduced
from Table A that for significance at the 5 percent level, the ratio of the difference
in $z_r$ to $s_{z_r}$ must be 1.645 or more. Since the obtained ratio (.232/.125) is 1.86,
r may be regarded as significantly different from .70 at the 5 percent level.

It may be noted that if the decision at the appropriate time (namely, in 
advance of computation) had been to use the 1 percent level of significance, 
a difference of 2.327 standard errors would have been required. According- 
ly, the obtained r of .80 would not have been regarded as significantly 
different from a hypothetical population value of .70. 
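The comparison of an observed r with a hypothetical value can be sketched as follows. The one-sided critical points are taken from the standard library's normal distribution rather than from Table A, and the function name is an assumption of this illustration.

```python
import math
from statistics import NormalDist

def z_test_r(r, hypothetical, n):
    """Difference in Fisher z_r values divided by 1 / sqrt(N - 3)."""
    se = 1 / math.sqrt(n - 3)
    return (math.atanh(r) - math.atanh(hypothetical)) / se

ratio = z_test_r(0.80, 0.70, 67)
crit_05 = NormalDist().inv_cdf(0.95)   # one-sided 5 percent point
crit_01 = NormalDist().inv_cdf(0.99)   # one-sided 1 percent point
print(round(ratio, 2), ratio >= crit_05, ratio >= crit_01)
```

Carried to more decimals, the ratio is a little under the 1.86 of the text, which worked from table values rounded to three places; it still exceeds the 5 percent point and falls short of the 1 percent point.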


CONFIDENCE INTERVALS 


Another way of utilizing statistical information from a sample is to develop
limits within which a given parameter is likely to be found. While for any
sample the inference that the parameter is within these limits is either
correct or incorrect, an α of, say, .05 indicates confidence that the parameters
will fall within the established limits in 95 out of 100 similar samples
drawn from the same population. In this way, two statistics of unequal
value provide a joint estimate of an interval that is likely to include the
parameter.

The first step is to choose a “confidence coefficient” (which may be
denoted as “conf.”) such as .95 or .99. “Conf.” is the complement of α,
the probability that the parameter is outside the limits. Since two limits
are involved, the procedure resembles a “two-sided” test of significance,
and probability tables are entered accordingly. Let θ be the parameter
corresponding to any normally distributed statistic and let U and L be the
upper and lower confidence limits for the confidence coefficient of (1 − α).
Then

$$\text{Conf.}(U > \theta > L) = 1 - \alpha \tag{13.17}$$

which is a formal statement of the situation described above. Strictly
speaking, the inequality sign > in Formula 13.17 should be written ≥,
indicating “equal to or greater than,” but it is assumed here for purposes
of simplifying the discussion that no parameter falls precisely at a limit.
In a normal distribution, .025 of the cases are above +1.96σ and .025 are
below −1.96σ. Arbitrarily designating any normally distributed statistic
as X and its standard error as $s_X$, two equations must be solved to establish
a confidence interval of .95:

$$\frac{X - L}{s_X} = 1.96 \quad \text{and} \quad \frac{X - U}{s_X} = -1.96 \tag{13.18}$$
Similarly, .005 of the cases of a normally distributed variable are above
2.58σ and .005 below −2.58σ. Thus the equations to be solved to establish
U and L for a confidence coefficient of .99 are

$$\frac{X - U}{s_X} = -2.58 \quad \text{and} \quad \frac{X - L}{s_X} = +2.58 \tag{13.18a}$$

This procedure applies to establishing confidence limits for statistics
such as the mean for large samples, r (through the $z_r$ transformation), and
P, the proportion of cases falling in one of two categories.

When the statistic, divided by its estimated standard error, follows the
$t$ distribution, the procedure is modified so as to employ values of ±t
corresponding to the chosen α and the appropriate number of degrees of
freedom.

Example 13.5 gives numerical examples of establishing confidence limits
for normally distributed statistics, and Example 13.6 illustrates the
determination of confidence intervals for a mean based on a small number
of cases.


EXAMPLE 13.5


CONFIDENCE LIMITS FOR A NORMALLY DISTRIBUTED STATISTIC

As noted in the text, if a statistic is normally distributed and its parameter is
known, the standard error is the standard deviation of the statistic as computed in
successive random samples with constant N. Generally, however, the parameter is
unavailable and the center of the distribution remains unknown. If the statistic
is known to be normally distributed and a good estimate of the standard error
is available, it is possible to establish confidence (or fiduciary) limits within
which the parameter (which has a unique value) is, at a stated level of probability,
likely to be found.

For an observed mean of 28.0 with an estimated $s_M$ of 2.1, what are the confidence
limits within which the parameter is likely to be found, with a probability
of .95? In Formula 13.18, upper and lower limits are designated as U and L.
Then

$$\frac{28.0 - L}{2.1} = 1.96 \quad \text{and} \quad \frac{28.0 - U}{2.1} = -1.96$$

Solving, L = 23.9 and U = 32.1.

For the same data, what are the confidence limits within which the probability
is .99 that the parameter will fall? By Formula 13.18a,

$$\frac{28.0 - U}{2.1} = -2.58 \quad \text{and} \quad \frac{28.0 - L}{2.1} = +2.58$$

Solving, L = 22.6 and U = 33.4.

A higher probability, of course, requires a wider confidence interval than would
a lower probability, and this is reflected in the changes in the upper and lower 
limits. 
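The limits of Example 13.5 can be reproduced with a short sketch. The normal deviate is obtained from Python's `statistics.NormalDist` instead of Table A; the function name `normal_limits` is an assumption made for illustration.

```python
from statistics import NormalDist

def normal_limits(statistic, std_error, conf):
    """Lower and upper confidence limits for a normally distributed statistic."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided deviate, e.g. 1.96 for .95
    return statistic - z * std_error, statistic + z * std_error

low95, up95 = normal_limits(28.0, 2.1, 0.95)
low99, up99 = normal_limits(28.0, 2.1, 0.99)
print(round(low95, 1), round(up95, 1), round(low99, 1), round(up99, 1))
```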


EXAMPLE 13.6 


CONFIDENCE LIMITS FOR A MEAN WHEN N IS SMALL


If N is small, conventionally 30 or less, the ratio $(M - \mu)/s_M$ is not normally
distributed, but rather follows the $t$ distribution with (N − 1) df. Accordingly,
Formulas 13.18 and 13.18a must be modified by finding from a table of the t
function (such as Table T) the appropriate constants for the right-hand side of
each equation.

As in Example 13.5, let us consider an observed mean of 28.0 with estimated
$s_M$ of 2.1. However, instead of M being based on a large number of observations,
as was implied in Example 13.5, let N = 12. What now are the confidence limits
for confidence coefficients of .95 and .99?

In Table T we use the line for (N − 1) as 11 df. A confidence coefficient of .95
requires the points beyond which .05 of the distribution is likely to fall. In other
words, we need the value of t for a two-sided test when P = .05. In this case
t = 2.201.

Formula 13.18 now becomes

$$\frac{28.0 - L}{2.1} = 2.201 \quad \text{and} \quad \frac{U - 28.0}{2.1} = 2.201$$

Solving, L = 23.4 and U = 32.6.

Similarly, when df = 11 and P = .01, t = 3.106. By Formula 13.18a we have

$$\frac{28.0 - L}{2.1} = 3.106 \quad \text{and} \quad \frac{U - 28.0}{2.1} = 3.106$$

which give the lower confidence limit for a confidence coefficient of .99 as 21.5
and the upper limit as 34.5.


By the methods demonstrated in this example and in Example 13.5, it becomes
possible to specify with stated probability the band of values within which the
unique value that is the parameter is likely to be found.
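The small-sample limits can be sketched in the same form. Because the Python standard library has no t quantile function, the two-sided t values for 11 df are simply copied from Example 13.6; the dictionary and function names are assumptions made for illustration.

```python
# Two-sided t values for 11 df as quoted in Example 13.6 (from Table T).
T_VALUES_11_DF = {0.95: 2.201, 0.99: 3.106}

def t_limits(mean, std_error, conf, table=T_VALUES_11_DF):
    """Confidence limits for a mean, using a tabled t in place of a normal deviate."""
    t = table[conf]
    return mean - t * std_error, mean + t * std_error

low95, up95 = t_limits(28.0, 2.1, 0.95)
low99, up99 = t_limits(28.0, 2.1, 0.99)
print(round(low95, 1), round(up95, 1), round(low99, 1), round(up99, 1))
```

Set beside the limits of Example 13.5, these wider intervals show the cost of estimating the standard error from only 12 cases.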


It can be seen that confidence limits include the range of possible
parameter values such that the hypothesis of no difference between each
one of these values and X, the observed statistic, is not violated at level α.
For any given confidence coefficient, the range becomes narrower as N is
increased. Should the sample become coextensive with the population, the
confidence interval would, of course, be reduced to a single point.


SUMMARY


Modern mathematical statistics centers on the development of valid
inferences about the population or inferences based on investigations in
particular samples.

When samples are representative, probability theory provides a basis for
the exact sampling distributions of various statistics, the development
of the best possible estimates of parameters, and the use of testable
hypotheses in utilizing specific observations as the basis for sound
generalizations.


EXERCISES 


1. Below are $\sum X$ and $\sum X^2$ for 50 samples of 25 cases each of single-digit
scores on a college level reading test.

     ΣX   ΣX²     ΣX   ΣX²     ΣX   ΣX²     ΣX   ΣX²     ΣX   ΣX²
    133   873    123   689    122   786    106   552    118   642
    125   693    133   785    110   626    118   660    123   691
    121   657    127   733    133   843    138   876    127   781
    133   819    124   706    104   536    100   522    117   641
    103   517    101   499    133   803    110   580    121   733
    122   716    112   548    105   501    121   649    110   622
    130   748    120   688    125   751    115   635    114   640
    114   610    103   543    118   682    112   572    128   736
    131   835    125   727    146   944    115   681    110   554
    139   877    132   844    136   802    125   711    115   603

    N = 1250     ΣX = 6026     ΣX² = 34,462

Choose at random one of the samples as the “observed sample,” and from
$\sum X$ and $\sum X^2$ estimate $\sigma_M$. Compare $s_M$ so estimated with the actual standard
deviation of the 50 observed means.


2. Studying 32 Octopus vulgaris Lamarck, Muntz (5) reported that there were
more attacks on a vertical rectangle than on a horizontal rectangle. With
31 df, t = 4.21. Evaluate the significance of the difference.


3. In a study by Kates, Yudin, and Tiffany (3) total times in seconds for solving
certain problems were as follows:

                               N       M        s

    Younger hearing group     30     11.83     6.58
    Older hearing group       30     12.85     6.28
    Deaf group                30     10.32     7.13

Test the hypothesis that the older hearing group and the deaf group do
not differ significantly.


4. In an appropriately selected random sample of 145 cases, the correlation
between two variables is observed to be .15. Test whether it is reasonable to
believe that the parameter r is positive rather than zero or negative.




5. Panton (6) gives the following data on the Pd (psychopathic deviate) scale of 
the Minnesota multiphasic personality inventory for nonhabitual and habitual 
criminals: 


NONHABITUAL HABITUAL 


CRIMINALS CRIMINALS 
Mean 65.9 157 
5 937 10.3 
N 50 50 


Test whether the difference in means is significantly different from zero. 


6. In one sample of 25 cases, the observed correlation between two variables
was .40; in a second sample of 40 cases, it was .30. Can the difference in the
r’s be attributed to chance?


7. For 37 cases at age 9, Goodenough (2) found a correlation of .728 between
the Stanford Binet mental age and the Draw-a-Man mental age. What are
the limits for this r for a confidence coefficient of .99?


8. For 255 student pilots, Melton (4) reports that the validity of a practice trial 
on a complex coordination test was .24 compared with a validity on the 
first regular trial of .31. The correlation between the two predictors, practice 
trial and first regular trial, was .66. Is there a difference between the two r's? 
ANALYSIS
OF VARIANCE 


14 


Many of the results of educational and psychological experiments (as of
experiments in other fields, such as agriculture, where the method was
first applied) are reported in terms of analysis of variance. The term covers
a large body of statistical knowledge and practice, originally organized by
Fisher, and in recent years extended by other mathematical statisticians.
The scope of this text permits only a short introduction to the topic.

In all its ramifications, analysis of variance involves logical application
and extension of techniques already described. It employs means, deviations
from means, variances, estimates of error, and in advanced work that
includes the “analysis of covariance,” various reflections of correlation.
From the point of view of the types of variables considered, this method is
far more flexible than correlational analysis, since only the dependent
variable need be scaled in units. Theoretically, any number of independent
variables may be used, although, in practice, few studies use more than
two or three and many use only one. Often the independent variables are
nominal scales or two or three degrees of some scaled variate. As the
number of independent variables and the number of their categories or
degrees increase, the arithmetic of the analysis of variance tends to become
somewhat difficult to follow. Here, only the simplest cases are considered.
F DISTRIBUTION IN ANALYSIS OF VARIANCE


The most distinctive characteristic of the analysis of variance is its use of the 
F distribution, discovered by Fisher in the early 1920s. As described in 
Chapter 12, each F distribution is a chance distribution of the ratio of two 
variances, each with its stated degrees of freedom and each based on 
values drawn at random from the same normally distributed population. 
There is a separate F distribution for each pair of v's, and the published 
tables, such as Tables F, and Е; (Appendix), generally give only two 
points for each distribution: the point exceeded by 5 percent of the 
ratios purely by chance and the point exceeded by 1 percent. 

The test of significance that is the core of analysis of variance is the 
“F test." It involves finding the ratio of two variances which, if a stated 
null hypothesis were true, would fall inside a specified point of the appro- 
priate F distribution. If this variance ratio falls beyond the 5 percent (or 
1 percent) point of the particular F distribution, determined by the degrees 
of freedom of the two variances concerned, then the null hypothesis is 
rejected at the 5 percent (or 1 percent) level of significance. Thus, in the 
analysis of variance, testing of hypotheses follows the same basic procedures 
as those discussed in Chapter 13, except for the use of a novel statistic (the 
variance ratio) and F, its distribution. 


NEW EMPHASES IN THE ANALYSIS OF VARIANCE 


In addition to the distinctive feature of the variance ratio, certain important 
emphases are found in discussions of analysis of variance, most of which 
have been discussed in preceding chapters. These include: 

1. The insistence that the primary objective of statistical analysis is to 
make inferences about the population—inferences that have general 
validity—rather than merely the description of a particular sample. 

2. The use of a precise mathematical model in the form of the exact
distribution of a key statistic that would be expected by random sampling
if a given null hypothesis were true. This involves careful attention to the
degrees of freedom, since with the small samples that are characteristic of
much experimental work, the shapes of certain chance distributions
(notably t and F) change markedly with variations in the degrees of
freedom.

3. The development of precise hypotheses that can be tested by collecting
observations. The null hypothesis can be developed in a wide
variety of forms, but it is always exact, and potentially it can always be
refuted (not absolutely, but at a given level of significance). Among its
more useful forms are “no experimental effect” or “no difference among
means,” but forms that include posited parameters other than zero can be
readily established.

4. The use of fixed levels in judging significance. With the normal
distribution, it is mechanically simple to evaluate the rarity of an event
without regard to a fixed level, such as the 5 percent level or the 1 percent
level. With F (as with t), complete tables would be cumbersome, since
each change in the degrees of freedom involves a new distribution. The
advantage of fixed levels, however, is not merely mechanical. By deciding
on a fixed level of significance in advance, standards for rejecting null
hypotheses become objective for the course of the experiment, and interpretations
of outcomes are less likely to be influenced by subjective
considerations.

5. The use of methods to control extraneous variation that might
obscure results. These include eliminating variables and sources of variation,
matching subjects to form equivalent samples, and using chance
procedures, particularly tables of random numbers, to make assignments
to groups.

6. The designing of the experiment in advance, so that the
maximum amount of useful information can be extracted from the
observations.


STUDY OF DIFFERENCES AMONG MEANS


In the analysis of variance the question that is always asked is whether
the differences among a set of sample means of scores on the criterion or
dependent variable are greater than what would be expected if the samples
were drawn from the same population. The samples, except for differences
on the dependent variable (or variables), must be equivalent at the beginning
of the study. If no differences are found among the means (at the chosen
level of significance), the null hypothesis is accepted; if the differences are
statistically significant, the null hypothesis is rejected.

Analysis of variance used only as a test of a null hypothesis results
neither in the estimation of the magnitude of parameters nor in statements
as to the degree of relationship between variables, and it is often a preliminary
type of analysis. A study indicating an experimental effect of
some sort on a dependent variable might lead to a further study that would
aim to obtain more precise information as to the effect of variation in the
independent variable.

In the sense that it tests differences among any number of means
simultaneously, analysis of variance represents a generalization of the t test
described in Chapter 13 for the difference between two means.

As will be seen later, analysis of variance provides a convenient format
for the arithmetic of one or more tests of significance from a set of data.
Through its use, a mass of statistical data can be summarized in a format
in which the presence of influences and trends may be readily appreciated.
TERMINOLOGY AND NOTATION


SOME SPECIAL TERMS 


With the development of the analysis of variance as a statistical method, 
certain special terms and symbols have been introduced. Some will be 
noted here; more will be presented later. Since analysis of variance 
originally developed in the context of agricultural research, it includes 
terms from that source, notably “plots” and “treatments.” Other words 
and phrases have been modified from earlier statistical terminology. 

The phrase “sum of squares” always refers to the sum of squares of
deviations, either from the grand mean of all the values or from submeans.
A “sum of squares” divided by the appropriate number of degrees of
freedom (the number of values less one) becomes a “mean square,” or an
estimate of population variance, here denoted as V. Thus $\sum x^2/(N - 1) = V_x$.

If the sum of squares or mean square is derived by finding (in effect,
not necessarily directly) the deviation of each value from the mean of all
values, it may be distinguished by the subscript t (for total). Similar values
for “between groups” (or treatments) or for “within groups” (or
treatments) may be designated with the subscripts b and w, respectively.


ARRANGEMENT OF DATA 


Very often it is pertinent to analyze the values used in an analysis of 
variance in a rectangular, two-way classification, as shown in Table 14.1. 
In such an arrangement the number of rows may be denoted as r, the 
number of columns c, and these same letters may be used as subscripts 
wherever needed. At the intersection of any row and any column is a 
“cell.” 

A double subscript notation is convenient in identifying entries within
a row and within a column. If the rows are designated as $i = 1, 2, \ldots, r$ and
the columns $j = 1, 2, \ldots, c$, then any single cell value X may be identified
with a first subscript indicating row and a second subscript indicating
column. (The order is universal, and comes from a convention in matrix
algebra.) Thus $X_{32}$ refers to the entry in the third row and second
column.

This rectangular arrangement has various applications, some of which
arise only in advanced work in analysis of variance. Here, the columns
represent treatments, and the rows represent subjects or observations.
Table 14.1 shows four “treatments,” or “levels,” in the independent
variable and ten observations within each treatment. While the term level
may seem to imply some sort of ordering of the independent variable, such
a connotation is not necessarily present. “Levels” may refer merely to the
several categories of a nominal variable.

The number of “treatments,” or “levels,” within an independent variable
may be denoted as k; the number of observations (or subjects) within each
treatment, as n. For a simple analysis of variance design, with a single
independent variable (or “factor”) and the same number of observations
for each “treatment,” or “level,” the total number of observations is
kn, or N. Such is the case in Table 14.1, which shows k of 4, n of 10, and
N of 40.


TABLE 14.1. REPRESENTATION OF ORIGINAL DATA IN THE ANALYSIS OF VARIANCE


In this table there are c = k = 4 columns, each representing a group or
“treatment” (that is, four “levels” or “categories”) of the independent
variable. Within each group there are ten cases. (Equal-sized groups are not
essential in analysis, but by considering only equal-sized groups, the proofs
presented in the text are simplified.)

The number of rows is r = n = 10. However, in this particular design, the rows
have no special significance because the N observations are independent and
the order within the column is arbitrary.

Values of the dependent variable are given as X. The first subscript denotes
the row; the second, the column.


CASE WITHIN                    GROUP
GROUP            1         2         3         4
   1            X11       X12       X13       X14
   2            X21       X22       X23       X24
   3            X31       X32       X33       X34
   4            X41       X42       X43       X44
   5            X51       X52       X53       X54
   6            X61       X62       X63       X64
   7            X71       X72       X73       X74
   8            X81       X82       X83       X84
   9            X91       X92       X93       X94
  10            X10,1     X10,2     X10,3     X10,4


A set of observations, one for each treatment (or combination of
treatments when there are two or more independent variables), may be
called a replication. Since in Table 14.1 there are ten observations for each
“treatment,” the number of replications is ten. Occasionally, a complex
study with several “factors” or independent variables includes but a single
replication; in general, however, the number of replications is
considerably greater than one.

A “factorial” study is one with two or more “factors” or independent
variables. The number of levels or categories within each is indicated
separately. Thus, a 4 x 3 x 3 factorial experiment is one with three
factors or independent variables; the first has four levels or categories
and the others have three each. In describing a factorial study, the degree
of replication is stated. A 4 x 3 x 3 experiment has 36 treatment
combinations, so with five replications it would require a total of 180
observations.

THE VARIANCE RATIO

The analysis of variance is based on the following principles: 


1. The ratio of two variances, each calculated from independent samples 
from the same normally distributed population, follows the particular 
F distribution for the degrees of freedom of the numerator variance 
and the degrees of freedom of the denominator variance. 

2. When a sample is divided into groups of some sort, two uncorrelated 
estimates of the population variance can be obtained; the first is 
estimated from the variability of the means of the groups and the 
second from the variability within the several groups. 

3. If the division of the sample into groups is merely at random, then the
expected distribution of the variance ratio is F, with the appropriate
pair of df. This follows from the fact that F, for any pair of df, is the
expected distribution of an indefinitely large aggregation of ratios of
independent variances obtained by random sampling from the same
population.

4. The null hypothesis is that the division of the sample into groups is at
random (more precisely, the null hypothesis is that there is no difference
in the populations from which the two variances have been formed).

5. When a variance ratio exceeds the chosen significance point (5 or
1 percent), the null hypothesis is rejected. It now seems likely that the
division of the sample into groups was not at random.

6. In setting up a study, grouping can be by the categories of an
independent variable, which may be any type of scale (nominal, ordinal,
interval, or ratio). The quantitative variable from which two estimates
of the population variance are obtained is the dependent variable.

7. If the null hypothesis is rejected, the presence of a relationship between 
the variable represented by the grouping and the dependent variable 
becomes plausible. 
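
Principles 3 to 5 can be illustrated by simulation. The following Python
sketch is an illustration only and is not part of the procedure described in
the text. It draws 40 random normal values, divides them arbitrarily into four
groups of ten, and records how often the variance ratio exceeds the 5 percent
point. The function names, the number of trials, and the critical value of
2.87 (an approximate tabled 5 percent point of F for 3 and 36 df) are
assumptions made for the example.

```python
import random
from statistics import fmean


def variance_ratio(groups):
    """Return F = V_b / V_w for a list of groups of numbers."""
    values = [x for g in groups for x in g]
    grand_mean = fmean(values)
    means = [fmean(g) for g in groups]
    # Between groups: n times the squared deviation of each group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: squared deviations of each value from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def rejection_rate(trials=4000, k=4, n=10, critical_f=2.87, seed=1):
    """Share of purely random groupings whose F exceeds critical_f.

    critical_f is an assumed, approximate 5 percent point for 3 and 36 df.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(k * n)]
        groups = [sample[i * n:(i + 1) * n] for i in range(k)]
        if variance_ratio(groups) > critical_f:
            exceed += 1
    return exceed / trials


if __name__ == "__main__":
    print(f"share of F ratios beyond the 5 percent point: {rejection_rate():.3f}")
```

When the division into groups is made purely at random, as here, the share
should be close to .05, which is what is meant by saying that the variance
ratio follows the F distribution under the null hypothesis.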


DIVISION OF A SUM OF SQUARES INTO TWO COMPONENTS 


To demonstrate that, when a total group consists of subgroups, the total
sum of squares may be considered to have two sources, one may start with
simple numerical examples.


Consider a total distribution of nine values, ranging from 3 to 5, with
three cases at each step. Then:


X     f     x     fx²
5     3    +1     3
4     3     0     0
3     3    -1     3


$N = 9$; $M_t = 4$; $\sum x_t^2 = 6$.

Now consider the same cases divided into three groups: A, B, and C:


GROUP A GROUP B GROUP C 
X f f f 
5 1 1 1 
4 1 1 1 
3 1 1 1 


For each group it is readily seen that $\sum x^2 = 2$, and the sum of these
$\sum x^2$ for three groups is 6, just as for the total sum of squares. However,
this is a very unusual case in that $M_A = M_B = M_C = M_t = 4$. Because
the three group means are identical, the “between groups sum of squares”
is zero. Therefore $V_b$, the “between groups variance,” is zero. $V_w$, the
“within groups variance,” is obviously a positive number, while the F ratio,
$V_b/V_w$, is .00.

Now consider a different division of the cases into three groups: 


GROUP A GROUP B GROUP C 
X f f f
5 0 0 3 
4 0 3 0 
3 3 0 0 


Here it is apparent that there is no variability at all within any group and
that the “within group sum of squares” is zero. Therefore $V_w$ is zero.

The means, however, are 3, 4, and 5, respectively. They exhibit
variability, and a variance of some magnitude could be computed for them.

Again, this is a special and atypical situation, but with $V_b$ positive and
$V_w$ zero, the F ratio $V_b/V_w$ is infinitely large.

In the discussion of the F distribution in Chapter 12, it was noted that
the F ratio varies from 0 to ∞. Examples of the two extremes have been
given. With real data, an F intermediate between 0 and ∞ can be
anticipated.
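
The two extreme divisions of the nine values can be checked directly. The
short Python sketch below is an illustration; the function name `partition`
and the variable names are chosen here and do not come from the text. It
computes the within-groups, between-groups, and total sums of squares from
the deviations themselves.

```python
from statistics import fmean


def partition(groups):
    """Return (within, between, total) sums of squares of deviations."""
    values = [x for g in groups for x in g]
    grand_mean = fmean(values)
    within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    total = sum((x - grand_mean) ** 2 for x in values)
    return within, between, total


# First division: every group holds one 5, one 4, and one 3.
identical_means = [[5, 4, 3], [5, 4, 3], [5, 4, 3]]
# Second division: each group holds three equal values.
no_within_variation = [[3, 3, 3], [4, 4, 4], [5, 5, 5]]

if __name__ == "__main__":
    for label, groups in (("first", identical_means),
                          ("second", no_within_variation)):
        within, between, total = partition(groups)
        print(f"{label} division: within={within}, between={between}, total={total}")
```

In both divisions the total is 6. In the first it lies wholly within groups;
in the second, wholly between groups.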


ALGEBRAIC SOLUTION 
It will now be shown algebraically how, when data are arranged in groups,
the total sum of squares can be uniquely and precisely divided into two
components, the “between groups sum of squares” and the “within
groups sum of squares.” From these two sums of squares, a variance ratio
may be constructed and compared with the appropriate F distribution.

Let the subscript t identify statistics of the total sample; w, statistics
within groups; and b, statistics of the means (that is, “between groups”).
There are k groups, each with n cases. When groups are of the same size,
the total number of cases is kn or N.

With any deviation, that is, any $(X - M_t)$ designated as x, its square is

$$x^2 = X^2 - 2M_tX + M_t^2 = X^2 - \frac{2X\left(\sum X\right)}{N} + \frac{\left(\sum X\right)^2}{N^2}$$

Summing and writing equivalents,

$$\sum x^2 = \sum X^2 - \frac{2\left(\sum X\right)\left(\sum X\right)}{N} + \frac{N\left(\sum X\right)^2}{N^2} = \sum X^2 - \frac{1}{N}\left(\sum X\right)^2 = \frac{N\sum X^2 - \left(\sum X\right)^2}{N}$$

Specifically, the total sum of squares is

$$\sum_1^N x_t^2 = \sum_1^N X^2 - \frac{1}{N}\left(\sum_1^N X\right)^2 = \frac{N\sum X^2 - \left(\sum X\right)^2}{N} \tag{14.1}$$

Similarly, within any group the sum of squares is

$$\sum_1^n x_w^2 = \sum_1^n X_w^2 - \frac{1}{n}\left(\sum_1^n X_w\right)^2 = \frac{n\sum X_w^2 - \left(\sum X_w\right)^2}{n} \tag{14.2}$$

Summing Formula 14.2 over the k groups and using double summation
signs to indicate that the sums have been summed,

$$\sum_1^k\sum_1^n x_w^2 = \sum_1^k\sum_1^n X_w^2 - \frac{1}{n}\sum_1^k\left(\sum_1^n X_w\right)^2 \tag{14.3}$$

The expression $\sum_1^k\sum_1^n X_w^2$ is precisely $\sum_1^N X^2$, the sum of the squares of the
original values.

Substitution in Formula 14.3 of the equivalent for $\sum_1^N X^2$,

$$\sum_1^N x_t^2 + \frac{1}{N}\left(\sum_1^N X\right)^2$$

found by rearranging terms in Formula 14.1, yields

$$\sum_1^N x_w^2 = \sum_1^N x_t^2 + \frac{1}{N}\left(\sum_1^N X\right)^2 - \frac{1}{n}\sum_1^k\left(\sum_1^n X_w\right)^2 \tag{14.4}$$

It will now be shown that the last two terms in Formula 14.4 are
precisely equal to $-\sum x_b^2$, the “between groups sum of squares,” or
“treatment sum of squares,” which is n times the sum of the squares of the
deviations of the k group means from the total mean. Indicating the
difference between any group mean $\left(\frac{1}{n}\sum_1^n X_w\right)$ and the total mean
$\left(\frac{1}{N}\sum_1^N X\right)$ as d,

$$d = \frac{1}{n}\sum_1^n X_w - \frac{1}{N}\sum_1^N X$$

Squaring,

$$d^2 = \frac{1}{n^2}\left(\sum_1^n X_w\right)^2 - \frac{2}{nN}\left(\sum_1^n X_w\right)\left(\sum_1^N X\right) + \frac{1}{N^2}\left(\sum_1^N X\right)^2$$

Summing over the k groups, and writing $N^2$ as $nkN$ in the final term,

$$\sum_1^k d^2 = \frac{1}{n^2}\sum_1^k\left(\sum_1^n X_w\right)^2 - \frac{2}{nN}\left(\sum_1^N X\right)\left(\sum_1^k\sum_1^n X_w\right) + \frac{k}{nkN}\left(\sum_1^N X\right)^2 \tag{14.5}$$

Canceling k out of the final term, substituting the equivalent $\sum_1^N X$ for
$\sum_1^k\sum_1^n X_w$, consolidating the last two terms, and multiplying both sides of
Formula 14.5 by n,

$$n\sum_1^k d^2 = \sum x_b^2 = \frac{1}{n}\sum_1^k\left(\sum_1^n X_w\right)^2 - \frac{1}{N}\left(\sum_1^N X\right)^2 \tag{14.6}$$

Substituting Formula 14.6 in Formula 14.4 yields the fundamental
equation:

$$\sum x_w^2 = \sum x_t^2 - \sum x_b^2 \tag{14.7}$$

or

$$\sum x_b^2 = \sum x_t^2 - \sum x_w^2 \tag{14.7a}$$

or

$$\sum x_t^2 = \sum x_w^2 + \sum x_b^2 \tag{14.7b}$$

That is to say, the total sum of squares is equal to the within-groups sum 
of squares plus the between-groups sum of squares. 
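
The identity can also be verified numerically from the raw-score forms of
Formulas 14.1, 14.3, and 14.6. The sketch below is illustrative: the scores
are invented for the purpose, the function names are assumptions, and exact
fractions are used so that the equality is not blurred by rounding. Equal
group sizes are assumed, as in the derivation.

```python
from fractions import Fraction


def total_ss(groups):
    """Formula 14.1: sum of X^2 less (sum of X)^2 / N, over all values."""
    values = [Fraction(x) for g in groups for x in g]
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


def within_ss(groups):
    """Formula 14.3: sum of X^2 less (1/n) times the sum of squared group sums."""
    n = len(groups[0])  # equal group sizes assumed
    raw = sum(Fraction(x) ** 2 for g in groups for x in g)
    return raw - sum(Fraction(sum(g)) ** 2 for g in groups) / n


def between_ss(groups):
    """Formula 14.6: (1/n) times the sum of squared group sums, less (sum of X)^2 / N."""
    n = len(groups[0])  # equal group sizes assumed
    n_total = n * len(groups)
    grand_sum = Fraction(sum(sum(g) for g in groups))
    return sum(Fraction(sum(g)) ** 2 for g in groups) / n - grand_sum ** 2 / n_total


# Hypothetical scores: k = 3 groups of n = 4 cases each.
scores = [[2, 5, 7, 6], [9, 4, 8, 7], [1, 3, 3, 5]]

if __name__ == "__main__":
    t, w, b = total_ss(scores), within_ss(scores), between_ss(scores)
    print(f"total={t}, within={w}, between={b}, within+between={w + b}")
```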

By methods beyond the scope of this text, these two sums of squares,
$\sum x_b^2$ and $\sum x_w^2$, can be shown to be independent.

When a sum of squares is divided by the appropriate number of degrees
of freedom, the result is a “mean square.” By this procedure, $\sum x_b^2$ and
$\sum x_w^2$ will yield two independent estimates of the population variance.

If the variation among the group means is no more than would be
expected by random sampling, then a long series of variance ratios, $V_b/V_w$,
would be expected to follow the appropriate F distribution. However, if an
observed variance ratio $V_b/V_w$ is so large that it is significant at the
predetermined level (5 or 1 percent), then the null hypothesis is rejected and
the conclusion is reached that the differences among the means are greater
than could be expected by chance (always provided the assumptions underlying
the F test have been met).

This use of F provides a way of detecting nonchance differences when a
number of means are concerned, as contrasted with the t test, which is
applicable to only two means at a time.

In a simple analysis of variance, with a single independent (or treatment)
variable distinguishing groups 1, 2, ..., k, the null hypothesis is that
$\mu_1 = \mu_2 = \cdots = \mu_k = \mu$. A significant variance ratio does not in itself
indicate whether the means are generally different one from another or
whether only one or two of the means differ from the others. Such a
finding would require further investigation. Neither does a significant F
indicate what type of relationship (such as linear, or nonlinear) might exist
between the independent variable, on which the groups were formed, and
the dependent variable. Nor does a highly significant F necessarily reflect a
stronger degree of relationship between the independent and dependent
variable than would a ratio less highly significant. Description of the type
of relationship and measurement of its degree would require fitting a
function (such as a straight line, as in correlation analysis, or some sort of
a curve) and measuring the fit, as by the magnitude of the sum of the
squares of the errors. In a sense, analysis of variance is often a preliminary
technique aimed at detecting evidence for any type of relationship other
than that expected by sampling variation.


DEGREES OF FREEDOM IN ANALYSIS OF VARIANCE 


Like the sums of squares, the degrees of freedom in analysis of variance
are additive. For the simplest design of k groups of n cases each and a
single independent variable, the total number of df is (N − 1), N being the
total number of cases. For the “between groups” sum of squares there are
(k − 1) degrees of freedom, one less than the number of groups. For the
“within groups” sum of squares, the df are (N − 1) − (k − 1) = N − k.

After the “between groups” sum of squares has been divided by
(k − 1) to form $V_b$ and the “within groups” sum of squares has been
divided by (N − k) to form $V_w$, the variance ratio F is found by dividing
$V_b$ by $V_w$.

The degrees of freedom have their usual meaning. In forming any sum
of squares, 1 df is lost for each different constraint imposed. The number
of df is always the number of varying values from which the sum of the
squares of deviations is formed, less the number of means used in forming
the deviations. For the total sum of squares, there are N values and a
single mean; hence (N − 1) df. For the “between groups” sum of squares,
there are k varying means and a total mean from which the deviations are
taken; hence (k − 1) df. For the “within groups” sum of squares, there are
N values, each used in reference to the mean of the group in which it
appears. Since there are k such means, there are (N − k) df. The “between
groups” df, (k − 1), and the “within groups” df, (N − k), sum to the total
df, (N − 1).
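
The counting rule (values used, less means used) is easily put in code. The
small Python function below is a sketch with an assumed name and an assumed
argument, a list of group sizes; it allows groups of unequal size, which goes
slightly beyond the equal-n design discussed here.

```python
def degrees_of_freedom(group_sizes):
    """Return (between, within, total) df for a single-factor design."""
    k = len(group_sizes)        # number of group means
    n_total = sum(group_sizes)  # N, the number of values in all
    between = k - 1             # k means, less the one total mean
    within = n_total - k        # N values, less the k group means
    total = n_total - 1         # N values, less the one total mean
    return between, within, total


if __name__ == "__main__":
    # Four groups of ten cases, as in Table 14.1.
    print(degrees_of_freedom([10, 10, 10, 10]))
```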

In the more complicated analysis of variance designs, with two or more
independent variables and with terms representing single and combined
effects of such variables, the determination of df for the several sums of
squares may seem more complicated, but the basic principle is always the
same. The first step is to count the number of deviations that have, in
effect, been squared; the second step is to subtract the number of means
used in forming them. The result is the number of df to be used in
computing the “mean square” or variance estimate.

In complex designs, the df for the total sum of squares may be sub- 
divided in alternate fashions so that the various sums of squares indicated 
in the analysis have more than (N — 1)df. However, in all cases, the total 
df equals the sum of the df of one subset of independent sums of squares. 


ANALYSIS WITH A SINGLE INDEPENDENT VARIABLE 


The simplest analysis-of-variance designs involve a single independent
variable. Example 14.1 supposes that the subjects of a psychological study
have been assigned at random to treatment groups and that the variances
within each of the treatment groups may be taken as equal (within the
limits of sampling error).

The investigator is interested in determining whether the independent
variable, which in this case might, for example, be teaching methods, can
be considered to have a relationship to the outcome of teaching
(achievement in the subject matter), the dependent variable. It is to be noted
that if a study involves several qualitatively different treatments, the
independent variable necessarily consists simply of nominal categories.

The hypothesis to be tested, the null hypothesis, is that, within sampling
error, $\mu_1 = \mu_2 = \mu_3 = \mu_4$; that is to say, the differences between pairs of
means can be accounted for by sampling rather than by an experimental
effect resulting from differential treatment of the groups. Unless the groups
can be considered at the beginning of the study to be at the same level of the
dependent variable, and equally susceptible to the treatment, results of the
experiment will be ambiguous or even misleading. In this study, it is
assumed that controls of extraneous variation that may affect the outcome
have been adequate.

The arithmetic is simple. To adapt Formula 14.6 for computing purposes
so that the n’s of the groups need not be equal, it may be written as

$$\sum x_b^2 = \sum_1^k \frac{\left(\sum_1^{n_g} X_g\right)^2}{n_g} - \frac{\left(\sum_1^N X\right)^2}{N} \tag{14.6a}$$

which requires that each group be summed on the dependent variable and
that the square of this sum be divided by the number of cases in the group.
From the sum of these quotients a correction term, $(\sum X)^2/N$, is subtracted,
in which $\sum X$ is the total sum of the dependent variable and N is the total
number of cases.

The total sum of squares is found by Formula 14.1 and the within- 
groups sum of squares by Formula 14.3. The three sums of squares may 
be checked by Formula 14.7. 
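
A minimal Python sketch of this arithmetic follows. It is an illustration
under stated assumptions: the function names and the small data set are
invented here, and the within-groups sum is computed group by group with each
group's own n (a natural extension of Formula 14.3, which is written for
equal n). The check at the end is Formula 14.7b.

```python
def between_groups_ss(groups):
    """Formula 14.6a: sum of (group sum)^2 / n_g, less (grand sum)^2 / N."""
    grand_sum = sum(sum(g) for g in groups)
    n_total = sum(len(g) for g in groups)
    return sum(sum(g) ** 2 / len(g) for g in groups) - grand_sum ** 2 / n_total


def within_groups_ss(groups):
    """Sum over groups of (sum of X^2) less (group sum)^2 / n_g."""
    return sum(sum(x * x for x in g) - sum(g) ** 2 / len(g) for g in groups)


def total_ss(groups):
    """Formula 14.1 applied to all N values."""
    values = [x for g in groups for x in g]
    return sum(x * x for x in values) - sum(values) ** 2 / len(values)


# Hypothetical groups of unequal size (n = 3, 2, and 4).
unequal_groups = [[4, 6, 8], [1, 3], [7, 9, 9, 11]]

if __name__ == "__main__":
    b = between_groups_ss(unequal_groups)
    w = within_groups_ss(unequal_groups)
    t = total_ss(unequal_groups)
    print(f"between={b:.4f}, within={w:.4f}, total={t:.4f}")
    print("check (14.7b):", abs(t - (w + b)) < 1e-9)
```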

Provided the same step interval and arbitrary origin are used for the
total group and for all the subgroups, Formulas 14.1, 14.3, and 14.6a can
be used with only minor modifications for computations from frequency
distributions. In this case, each X would be read as x′ and each X² as x′²,
or as deviations and squares in terms of step intervals from an arbitrary
origin. To form approximately true values for the sums of squares, final
results would have to be multiplied by i², the square of the step interval.
However, this correction would cancel out in the variance ratio.
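
That the correction cancels can be seen in a small numerical case. In the
Python sketch below, which is illustrative only, the scores, the arbitrary
origin of 70, and the step interval of 10 are all invented, and the function
names are assumptions. Individual scores are coded rather than a grouped
frequency distribution, which is enough to show the cancellation.

```python
def f_ratio(groups):
    """Variance ratio V_b / V_w from raw-score sums (Formulas 14.6a and 14.1)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_sum = sum(sum(g) for g in groups)
    between = sum(sum(g) ** 2 / len(g) for g in groups) - grand_sum ** 2 / n_total
    within = sum(sum(x * x for x in g) - sum(g) ** 2 / len(g) for g in groups)
    return (between / (k - 1)) / (within / (n_total - k))


def code(groups, origin, step):
    """Express each score as a deviation from `origin` in units of `step`."""
    return [[(x - origin) / step for x in g] for g in groups]


# Hypothetical raw scores and their coded equivalents.
raw = [[50, 60, 70], [60, 70, 80], [80, 90, 100]]
coded = code(raw, origin=70, step=10)

if __name__ == "__main__":
    print("F from raw scores:  ", f_ratio(raw))
    print("F from coded scores:", f_ratio(coded))
```

Both sums of squares from the coded scores are smaller than the raw ones by
the factor i² = 100, so their ratio is unchanged.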

The three sums of squares, corresponding degrees of freedom, mean
squares or variances, and the F ratio may be tabulated as shown in the
accompanying table.


MODEL FOR A SIMPLE ANALYSIS OF VARIANCE


SUM OF SQUARES                     df        MEAN SQUARE, OR V    F
Between samples (or treatments)    k − 1     V_b                  V_b/V_w
Within samples (or treatments)     N − k     V_w
TOTAL                              N − 1


The tabulation may also show the size of F required for significance at
the preselected level (α = .05 or α = .01).

The use of this format is illustrated in Example 14.1, which, having a 
"randomized group design," illustrates the simplest form of the analysis 
of variance. Here, the total sum of squares and the related degrees of 
freedom are divided so that two independent mean squares are found. 
Since there are only two independent mean squares, the between-groups 
variance and the within-groups variance, only a single F test is possible. 


EXAMPLE 14.1 


SIMPLE ANALYSIS OF VARIANCE 


Purpose. The simplest type of analysis of variance (sometimes called the
randomized group design) aims to test the hypothesis that there are no differences
among the group means; that is, that each differs from the others no more than
would be anticipated on the basis of chance (at a stated level of confidence) if
each sample were drawn at random from the same normally distributed
population. In effect, the method of the t test is generalized to take care of
more than two means.

In the present instance, we wish to test whether the four group means in Table
14.2 differ significantly among themselves. Arbitrarily we decide to reject the
null hypothesis if the F test is significant at the .05 level.


TABLE 14.2. DATA FOR SIMPLE ANALYSIS OF VARIANCE


(Four groups, ten cases in each group; k = 4, n = 10, N = 40. Samples represent
a population known to be normally distributed.)


GROUP 1    GROUP 2    GROUP 3    GROUP 4
    7         43         24         27
   37         39         30         20
   44         21         17         25
   28         32         34         13
   42         21         32         32
   23         14         35          4
   45         24         19          3
   32         24          6         36
   37         26         30         28
   45         26         19         21

Computation of the Between-Groups Sum of Squares

Means ($M_g$):     34.0      27.0      24.6      20.9
$M_g - M_t$:      +7.375    +.375    −2.025    −5.725

$M_t = 26.625$;  $\sum (M_g - M_t)^2 = 91.4075$;  $n\sum (M_g - M_t)^2 = 914.075 = \sum x_b^2$

By Formula 14.6a,

$$\sum x_b^2 = \sum_1^k \frac{\left(\sum_1^n X_g\right)^2}{n_g} - \frac{\left(\sum_1^N X\right)^2}{N} = \frac{(340)^2}{10} + \frac{(270)^2}{10} + \frac{(246)^2}{10} + \frac{(209)^2}{10} - \frac{(1065)^2}{40} = 914.075$$

Method. Numerical computations are straightforward.

1. For each group find n, $\sum_1^n X$, and $\sum_1^n X^2$. For the data of Table 14.2, these
values (together with the totals) are:


GROUP     n          ΣX             ΣX²
1         10         340            12,874
2         10         270             7,976
3         10         246             6,828
4         10         209             5,493
TOTAL     N = 40     ΣX = 1,065     ΣX² = 33,171


2. For each group find $\sum_1^n x_g^2$, that is, the sum of the squares of the deviations
from the group mean. A convenient computing formula, applicable to sums of
squares generally, is

$$\sum x^2 = \frac{N\sum X^2 - \left(\sum X\right)^2}{N}$$

For the four groups of Table 14.2, these sums are

$\sum x_1^2 = 1314.0$;  $\sum x_2^2 = 686.0$;  $\sum x_3^2 = 776.4$;  $\sum x_4^2 = 1124.9$

3. Sum these sums of squares to find the within-groups sum of squares;
that is,

$$\sum x_w^2 = \sum_1^k\sum_1^n x_g^2 = 1314 + 686 + 776.4 + 1124.9 = 3901.3$$

4. For all groups, sum the $n_g$ to find N, the total number of cases; that is,

$$\sum_1^k n_g = N$$

If all $n_g$ are identical, kn = N.

For all groups, sum the $\sum_1^n X$ to find the grand sum of all the values; that is,

$$\sum_1^k\sum_1^n X = \sum_1^N X$$

These values, N, $\sum X$, and $\sum X^2$, are the totals shown in step 1.

5. Compute $\sum x_t^2$ as follows:

$$\sum x_t^2 = \frac{N\sum X^2 - \left(\sum X\right)^2}{N} = \frac{40 \times 33{,}171 - (1065)^2}{40} = 4815.375$$

6. Find the between-groups sum of squares by subtracting the within-groups
sum of squares from the total sum of squares, as indicated by Formula 14.7a.

$$\sum x_b^2 = \sum x_t^2 - \sum x_w^2 = 4815.375 - 3901.3 = 914.075$$

7. Divide $\sum x_b^2$ by the appropriate degrees of freedom (k − 1), in this case,
3 df, to form the between-groups variance; divide $\sum x_w^2$ by (N − k) or, in this
case, 36 df, to form the within-groups variance. F is the ratio of these two variance
estimates.


Final computations are shown in the table.

                    SUM OF                MEAN SQUARE,
                    SQUARES      df       OR VARIANCE       F
Between groups       914.1        3          304.7         2.81
Within groups       3901.3       36          108.4
TOTAL               4815.4       39


Reference to Table Fs indicates that with 3 df in the numerator and 36 df in the
denominator, an F of 2.81 is very nearly significant at the 5 percent level.

Actually, the result is slightly surprising. As the source of the data, four samples
of ten values each were drawn at random from the Rand table of random normal
deviates (6) and converted linearly to two-digit positive numbers. These are the
four groups analyzed. Only in 5 percent of such samplings can F be expected
to be significant at the .05 level.

Formation of the variances is actually an optional step, since the F ratio can
be readily found from the sums of squares and the two values of ν, the degrees
of freedom. In this case,

$$F = \frac{\nu_w \sum x_b^2}{\nu_b \sum x_w^2} = \frac{36 \times 914.1}{3 \times 3901.3} = 2.81$$

in which $\nu_w$ is the number of df associated with the within-groups sum of squares
and $\nu_b$ is the number of df for the between-groups sum of squares.

Check on Between-Groups Sum of Squares. At the bottom of Table 14.2, the
between-groups sum of squares is checked by working with the group means and
the total mean. The difference between each group mean and the total mean is
squared. These squared deviations are summed and multiplied by n, the number
of cases within each group. The result, 914.075, is $\sum x_b^2$. If the groups were of
unequal size, it would be necessary to multiply each squared difference by the n
of the group, and then sum these. An even simpler method of finding $\sum x_b^2$ is by
Formula 14.6a. Its use is shown at the bottom of Table 14.2.
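
The whole of Example 14.1 can be retraced in a few lines of Python. The
sketch below is an illustration, with function and key names chosen here. The
data are those of Table 14.2; the first value of Group 1 is taken as 7, the
value implied by the reported group sum of 340 and sum of squares of 12,874.

```python
# Data of Table 14.2 (first value of Group 1 taken as 7; see the note above).
GROUPS = [
    [7, 37, 44, 28, 42, 23, 45, 32, 37, 45],
    [43, 39, 21, 32, 21, 14, 24, 24, 26, 26],
    [24, 30, 17, 34, 32, 35, 19, 6, 30, 19],
    [27, 20, 25, 13, 32, 4, 3, 36, 28, 21],
]


def simple_anova(groups):
    """Follow steps 1 to 7 of the example and return the table entries."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_sum = sum(sum(g) for g in groups)
    raw_ss = sum(x * x for g in groups for x in g)
    # Steps 2 and 3: within-groups sum of squares, group by group.
    ss_within = sum(sum(x * x for x in g) - sum(g) ** 2 / len(g) for g in groups)
    # Step 5: total sum of squares (Formula 14.1).
    ss_total = raw_ss - grand_sum ** 2 / n_total
    # Step 6: between-groups sum of squares by subtraction (Formula 14.7a).
    ss_between = ss_total - ss_within
    # Step 7: mean squares and their ratio.
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


if __name__ == "__main__":
    for name, value in simple_anova(GROUPS).items():
        print(f"{name:>11}: {value:.3f}")
```

The printed values should agree with the table of final computations: sums of
squares of 914.1, 3901.3, and 4815.4, and an F of 2.81.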


An F test always tests whether two variances differ significantly. The
numerator mean square always relates directly to the one hypothesis
under consideration; the denominator variance reflects sampling variation.
Terms applied to a denominator variance include within-treatments mean
square, residual mean square, and remainder mean square, or simply, the
error term.

A significant F is one that shows that the variance related to some
hypothesis is greater than could be expected simply by sampling. Typically
in analysis of variance, one or more sources of extraneous variation are
controlled or eliminated. If group means are identical within sampling
error, then there is no variability that can be considered to be related to the
hypothesis under study.

The more advanced designs, which for the most part are beyond the
scope of this text, involve dividing the total sum of squares (and related
degrees of freedom) into more than two components, and applying the F
test to one or more pairs of mean squares. Always the denominator
variance is an “error term,” but in some designs two error terms may be
found from the same set of observations. Also, as the number of independent
variables increases, the number of possible F tests increases, both for
the effect of each independent variable with variation in the other
independent variable eliminated and for combinations of independent variables.
“Factorial design” involves the sorting out of effects of two or more
independent variables in the same study.

Certain relationships of the sums of squares used in the analysis of
variance to r, the product-moment coefficient of correlation, and η, the
correlation ratio, are demonstrated in Example 14.2.


EXAMPLE 14.2


SOME RELATIONSHIPS OF THE SUMS OF SQUARES TO r AND η


Purpose. The descriptive statistics, r and η, involve sums of squares of
deviations, and it is pertinent to inquire as to relationships between them and
certain sums of squares.

The Data. In Table 14.3 are exhibited six artificial scatter plots, constructed
to bring out the relationships. In these diagrams, the independent variable, $X_1$,
is scaled, which is a requirement for r, but not for the analysis of variance. While
analysis of variance is sometimes performed on scatter plots, it would ordinarily
involve fewer categories in the independent variable or much greater numbers
of cases than 27.

The Dependent Variable. In each of the six diagrams, the distribution of the
dependent variable $X_0$ is exactly the same. It is for this variable that the sum of
the squares of the deviations from the mean ($\sum x_t^2$) is separated, in a simple
analysis of variance, into two portions: the between-groups sum of squares,
$\sum x_b^2$, and the within-groups sum of squares, $\sum x_w^2$. It can be readily ascertained
that N = 27, $\sum X_0 = 54$, and $\sum X_0^2 = 144$. Accordingly, $\sum x_t^2 = 36$, while
$V_0 = 36/27$, or 1.33 (Formulas 14.1 and 5.35).

Computation of r. Arithmetical methods demonstrated in Chapter 6 were 
used to find the correlation coefficient reported at the upper left of each diagram. 

Computation of η. In Chapter 9, $\eta^2$, the square of the correlation ratio, was
defined as the ratio of the variance of the means of the arrays to the total variance.
The two $\eta^2$ are thus $\eta_{yx}^2 = s_{m_y}^2/s_y^2$ and $\eta_{xy}^2 = s_{m_x}^2/s_x^2$. In this example, $X_0$ is always
the dependent variable. Accordingly, $\eta_{01}^2 = V_{m_0}/V_0$. However, if both numerator
and denominator are multiplied by N, $\eta_{01}^2$ becomes, with changes in notation,
$\sum x_b^2/\sum x_t^2$; that is, the ratio of the between-groups sum of squares to the total
sum of squares.

The F test is based on the same information, namely, $\sum x_b^2$ and $\sum x_w^2$ (or
$\sum x_t^2 - \sum x_b^2$) plus knowledge of the df associated with each of the sums of
squares.


TABLE 14.3. SCATTER PLOTS WITH r², η², AND SUMS OF SQUARES


[Six scatter plots of $X_0$ against $X_1$, each with N = 27. The plotted
frequencies and the array sums printed below each diagram are not recoverable
here. The values reported with the diagrams are tabulated below.]


DIAGRAM     r²       η01²      Σx_b²     Σx_w²     Σx_t²
1           1.00     1.00      36        0         36
2            .52     1.00      36        0         36
3            .67      .67      24        12        36
4            .15      .31      11.1      24.9      36
5            .00      .00       0        36        36
6            .00      .00       0        36        36


Below each diagram the $\sum_1^n X_0$ of each of the arrays is shown. Each of these
has been squared and divided by its n. From the sum of these quotients, $(\sum X_0)^2/N$
is subtracted to find $\sum x_b^2$. In these diagrams $(\sum X_0)^2/N$ is always 108. Each
$\eta_{01}^2$ is simply $\sum x_b^2/\sum x_t^2$.

It will be remembered that η fits no function and is not dependent on the
structure of the independent variable. Because there is no within-groups variance
in diagrams 1 and 2, both values of $\eta_{01}^2$ are 1.00 (as are the values of $\eta_{10}^2$,
which are not shown).

In diagrams 5 and 6, both r² and $\eta_{01}^2$ are .00. However, it can be seen that in
diagram 5, $\eta_{10}^2$ would be high, while in diagram 6, $\eta_{10}^2$ would be .00 because in
neither direction do the means of the arrays have variability.

Relationship of r. By Formula 6.7a, in terms of original values,

$$V_0 = V_{0'} + V_{0.1}$$

That is, the total variance is divisible into the predicted and the unpredicted
variance. The predicted variance $V_{0'}$ is $r_{01}^2 V_0$, and the residual variance $V_{0.1}$
is $(1 - r_{01}^2)V_0$.

Using the data of diagram 3, it will be shown that $NV_{0'}$ is $\sum x_b^2$, the
between-groups sum of squares, and that $NV_{0.1}$ is $\sum x_w^2$, the within-groups sum of
squares. If N = 27, $V_0$ = 4/3 (or 1.33), and r² = 2/3 (or .67), then

$$NV_{0'} = 27 \times \tfrac{2}{3} \times \tfrac{4}{3} = 24 = \sum x_b^2$$

and

$$NV_{0.1} = 27\left(1 - \tfrac{2}{3}\right)\tfrac{4}{3} = 27 \times \tfrac{1}{3} \times \tfrac{4}{3} = 12 = \sum x_w^2$$

These relationships, however, do not hold in diagrams 2 and 4, where r’s
are nonzero, but the regression is not linear. It is only in the case that both
independent and dependent variables are in units suitable for correlation and
the regression is strictly linear that $r = \sqrt{\sum x_b^2/\sum x_t^2}$ (= η).

The result is not surprising in that r² can be expressed as $V_{0'}/V_0$, in which $V_{0'}$
is the variability of the points exactly on the regression line. In the special case
of strictly linear regression, $NV_{0'}$ would therefore equal $\sum x_b^2$ precisely.

Note: In all diagrams, $\sum X_0 = 54$, $\sum X_0^2 = 144$, and $\sum x_0^2 = 36$. The following
information, obtained from the diagrams, was used in finding values of r²:


DIAGRAM     ΣX1     ΣX1²     Σx1²     ΣX0X1     Σx0x1
1           54      144      36       144       36
2           54      156      48       138       30
3           81      297      54       198       36
4           99      423      60       216       18
5           87      357      76.7     174        0
6           81      295      52       162        0
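
The tabled sums suffice to recover both r² and η² for each diagram. The
Python sketch below is an illustration; the names are chosen here, r² is
computed by the usual product-moment formula from the deviation sums, and the
between-groups sums of squares are those reported with the diagrams of Table
14.3.

```python
# Sums for the dependent variable X0, identical in all six diagrams.
SUM_X0, SUM_SQ_X0, N = 54, 144, 27

# diagram: (sum X1, sum X1^2, sum X0*X1, between-groups sum of squares of X0)
DIAGRAMS = {
    1: (54, 144, 144, 36.0),
    2: (54, 156, 138, 36.0),
    3: (81, 297, 198, 24.0),
    4: (99, 423, 216, 11.1),
    5: (87, 357, 174, 0.0),
    6: (81, 295, 162, 0.0),
}


def r_squared(sum_x1, sum_sq_x1, sum_x0x1):
    """Squared product-moment correlation from raw sums."""
    ss_x0 = SUM_SQ_X0 - SUM_X0 ** 2 / N        # sum of x0^2
    ss_x1 = sum_sq_x1 - sum_x1 ** 2 / N        # sum of x1^2
    sp = sum_x0x1 - SUM_X0 * sum_x1 / N        # sum of x0*x1
    return sp ** 2 / (ss_x0 * ss_x1)


def eta_squared(ss_between):
    """Between-groups sum of squares over the total sum of squares."""
    ss_total = SUM_SQ_X0 - SUM_X0 ** 2 / N
    return ss_between / ss_total


if __name__ == "__main__":
    for number, (sx1, ssx1, sx0x1, ss_b) in DIAGRAMS.items():
        print(f"diagram {number}: r^2 = {r_squared(sx1, ssx1, sx0x1):.2f}, "
              f"eta^2 = {eta_squared(ss_b):.2f}")
```

The two measures coincide in diagrams 1, 3, 5, and 6 and part company in
diagrams 2 and 4, where the regression is not linear.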


It must be remembered, however, that the questions posed in the
analysis of variance and in statistics descriptive of relationships are quite
different. In effect, the purpose of a simple analysis of variance is merely to
determine whether, at a stated level of significance, an observed
relationship between an independent and a dependent variable is greater than
can be ascribed to chance. The degree of significance attained by an F ratio
is a function of the differences among the means, of the number of the
means, and of the total number of cases. On the other hand, η provides a
measure of the fit of the data to some unspecified function, while the
correlation coefficient reflects the degree to which the values of a structured
independent variable fall along a straight line.


BARTLETT'S TEST OF HOMOGENEITY OF VARIANCE 


In addition to the assumption that cases are drawn randomly and indepen- 
dently from a normally distributed population (or rather from k normally 
distributed populations if the k groups are considered to have differing 
origins), the assumption is made that the variances of the k populations 
are equal. 

Actually, the F test is not greatly affected by departures from normality
of the distribution of the dependent variable, and (particularly when n's
are equal) it is not greatly affected by inequalities among the group
variances. However, Bartlett's test of the homogeneity of variance provides
a method of determining at a preselected level of significance whether
variation of the group variances can be attributed to chance. Its use is
shown in Example 14.3.


EXAMPLE 14.3 


BARTLETT'S TEST OF HOMOGENEITY OF VARIANCE 


Purpose. When analysis of variance is applied to testing the differences among
k independent means, two assumptions are made:

1. That the $n_i$ values of the dependent variable in each group have been
drawn from a normally distributed population; and
2. That the k populations represented by the k samples have identical variances.

A test developed by Bartlett (1) uses as the null hypothesis:
$V_1 = V_2 = V_3 = \cdots = V_k$. A quantity

$$\frac{2.3026}{C}\left[(N - k)\log_{10} V_w - \sum (n_i - 1)\log_{10} V_i\right]$$

is evaluated as $\chi^2$ with (k - 1) degrees of freedom, with a high value of $\chi^2$
indicating lack of homogeneity. The value of C is found¹ from the formula

$$C = 1 + \frac{1}{3(k - 1)}\left[\sum \frac{1}{n_i - 1} - \frac{1}{N - k}\right]$$


Data. The data are from Example 14.1, where all values were drawn at random 
from a single normally distributed population. The F test for differences among 
the means was, however, significant at the .05 level. 

Computation of C. In Example 14.1, k = 4, since there are four groups; $n_i$ is
10 for all groups; and N = 40. Accordingly,

$$C = 1 + \frac{1}{3(4 - 1)}\left[\frac{4}{9} - \frac{1}{36}\right] = 1.046$$


Further Computations. Using the several sums of squares and corresponding
degrees of freedom from Example 14.1, the following estimated variances have
been found. Corresponding logarithms have been obtained from a common log
table (that is, logarithms to the base 10, not to the base e).


SUM OF SQUARES            df      V       $\log_{10} V$     $(n_i - 1)\log_{10} V_i$
$\sum x_1^2$ = 1314         9     146      2.16435        19.4792
$\sum x_2^2$ = 686          9      76.2    1.88195        16.9376
$\sum x_3^2$ = 776.4        9      86.3    1.93601        17.4241
$\sum x_4^2$ = 1124.9       9     125      2.09691        18.8722
                                           $\sum (n_i - 1)\log_{10} V_i$ = 72.7131
$\sum x_w^2$ = 3901.3      36     108.4    2.0350         $(N - k)\log_{10} V_w$ = 36(2.0350)
                                                          = 73.2600


Accordingly, the quantity to be evaluated as $\chi^2$ with 3 df is

$$\frac{2.3026}{1.046}(73.2600 - 72.7131) = 1.20$$

From Table C it is found that a $\chi^2$ of 1.20 with 3 df is not significant at the .05
level. The null hypothesis is not rejected, and the variances can be considered
equal in the populations represented by the samples.
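
The steps of Example 14.3 can be gathered into one routine. The sketch below
is only an illustration under stated assumptions: the function name
`bartlett_from_ss` is hypothetical, natural logarithms are used in place of
2.3026 times the common logarithm, and the variances are carried unrounded,
so the result may differ slightly from a hand computation made with a log
table. The inputs are the four within-group sums of squares and their degrees
of freedom from the example.

```python
import math


def bartlett_from_ss(sums_of_squares, dfs):
    """Bartlett's chi-square from group sums of squares and degrees of freedom.

    Returns (chi_square, C, degrees_of_freedom).
    """
    k = len(sums_of_squares)
    pooled_df = sum(dfs)                                   # N - k
    variances = [ss / df for ss, df in zip(sums_of_squares, dfs)]
    pooled_variance = sum(sums_of_squares) / pooled_df     # V_w
    # Correction term C = 1 + [sum 1/(n_i - 1) - 1/(N - k)] / [3(k - 1)]
    c = 1 + (sum(1 / df for df in dfs) - 1 / pooled_df) / (3 * (k - 1))
    # (N - k) ln V_w - sum (n_i - 1) ln V_i
    m = pooled_df * math.log(pooled_variance) - sum(
        df * math.log(v) for df, v in zip(dfs, variances)
    )
    return m / c, c, k - 1


chi_square, c_term, df = bartlett_from_ss([1314.0, 686.0, 776.4, 1124.9], [9, 9, 9, 9])
print(f"C = {c_term:.3f}, chi-square = {chi_square:.2f} with {df} df")
```

The statistic would then be referred to a chi-square table with k - 1 degrees
of freedom, exactly as in the example.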

What to Do When Variances Are Heterogeneous. When variances as tested
seem not to be homogeneous, the F test (if applied) needs to be interpreted more
stringently, since the effect of heterogeneity is to raise the value of F. A more
adequate procedure, which involves adjustment of the scales of measurement,
is suggested in advanced texts in experimental design.


¹ Here C is just a term to simplify the expression yielding the value distributed as chi
square. This C, of course, has no relationship either to C, the contingency coefficient,
or C, the covariance.
COMPLEX DESIGNS USING ANALYSIS OF VARIANCE


Discussion of the more complex analysis of variance patterns is beyond 
the scope of this text. Treatments of the logic as well as numerical examples 
showing computing routines applicable to psychological and educational 
data are given by Edwards (2), Lindquist (4), McNemar (5), Ray (7), 
Walker and Lev (8), and Winer (9). Topics include: 


1. Methods of handling two or more independent variables;

2. Techniques of handling correlated observations;

3. Methods of control of sources of variance not germane to the problem
of interest;

4. Techniques for reducing error; and

5. Techniques for increasing efficiency in the use of data.


Principles carried over from simple analysis of variance to the more 
complex designs include: 


1. Splitting the total sum of squares of the dependent variable into
components (but with the number of components increased to three or four
or more);

2. Breaking up the total df into the df associated with each of the sums of
squares;

3. Computation of variance ratios; and

4. Evaluation of each variance ratio by means of the appropriate F
distribution. A significant F ratio is taken as indicative of nonchance
variation among a set of means, which, in advanced designs, may not
actually be computed.


Novel elements in advanced designs include: 


1. The development of precisely described mathematical models of
complex experiments;

2. Consideration of the joint effects of two or more independent variables;
and

3. New uses of continuously scaled variables, as an independent variable
in "trend analysis" or as a control variable in the analysis of
covariance.


Often the advanced designs have descriptive names and distinctive
computing techniques, and have their justification in the particular
constellation of observed variables from which an answer is sought. More
importantly, however, consideration of the design prior to the experiment
often leads to a happy choice of sampling procedures, methods of
operation, and statistical treatment such that the results of the study are far
more useful than would otherwise be the case.
A TWO-WAY ANALYSIS OF VARIANCE


Example 14.4 comprises a two-way analysis of variance, again using arti- 
ficial data drawn from a normally distributed population. As in Example 
14.1, any F ratio would be expected to be insignificant, and here in fact 
no F ratio attains significance at the .05 level. 


EXAMPLE 14.4 


A TWO-WAY ANALYSIS OF VARIANCE 


Purpose. In investigations employing analysis of variance, it is possible to use 
two or more independent variables. Since the present example is based on 100 
values drawn at random (but rounded to eliminate fractions and adjusted 
linearly so as to avoid negative values) from the Rand table of 100,000 normal 
deviates (6), nothing beyond chance variation should appear among the means 
of the arrays.

The Problem. The four rows in Table 14.4 can be considered to represent four 
“levels,” or categories, of any “treatment” variable; say, four different methods 
of instruction. The five columns can be considered to represent five “levels,” 
or categories, of any other “treatment” variable; say, five different tutors. We 
can suppose that all five tutors are skilled in using the four different teaching 
methods and that each used each method with five randomly assigned students who
had no initial knowledge of the subject matter. It can also be assumed that other 
sources of variation, such as intelligence, reading speed, and interest, have been 
adequately controlled. 

In each of the cells or "plots" of Table 14.4 are five values of the dependent
variable. These values may be considered final test scores representing knowledge 
of the subject matter for five individuals after a stated number of hours of train- 
ing and study under the prescribed method of instruction, as directed by the 
assigned tutor. 


Questions that may be asked of the data are: 


1. At a stated level of significance (say, .05), are there reliable differences 
among tutors, as measured by the attainment of their students? 

2. At a stated level of significance (say, .05), are there reliable differences 
among methods? 

3. Is there significant “interaction” between tutors and methods; that is, do 
one or more tutors attain reliably better results with one or more methods 
than do other tutors with other methods? 


Analysis of a Portion of the Data. In actual research, of course, one would
attack the main problem, using all available data. However, to show certain
relationships between simple analysis of variance and analysis with two
independent variables, Table 14.5 is presented. As in Table 14.4, rows can be taken
to represent "methods" and columns can represent "tutors." However, there
is only a single entry in each cell, the total number of cases being 20. Actually,
the first observation in each of the cells of Table 14.4 is given as the only
observation in each of the cells of Table 14.5.
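TABLE 14.4. DATA FOR A TWO-WAY ANALYSIS OF VARIANCE
(100 values, drawn at random from a normal population: simulating a study with four
rows, five columns, five replications)


                         COLUMNS
ROWS           1      2      3      4      5      $\sum^{c}\sum^{n} X$
1             46     19     27     18     30
             ...
              31     38     33     29     37        749
2            ...
              25     28     39     20     42        725
3            ...
              17     33      5     32     35        699
4            ...
              27     13     17     31     43        712

$\sum^{r}\sum^{n} X$      567    632    525    568    593

N = 100
$\sum^{N} X$ = 2,885
$\sum^{N} X^2$ = 92,241

Computations

$$\sum x_b^2(\text{rows}) = \frac{r\sum^{r}\left(\sum^{c}\sum^{n} X\right)^2 - \left(\sum^{N} X\right)^2}{N} = \frac{(4 \times 2{,}082{,}171) - (2{,}885)^2}{100} = 54.59$$

$$\sum x_b^2(\text{cols}) = \frac{c\sum^{c}\left(\sum^{r}\sum^{n} X\right)^2 - \left(\sum^{N} X\right)^2}{N} = \frac{(5 \times 1{,}670{,}811) - (2{,}885)^2}{100} = 308.30$$

$$\sum x_i^2 = \sum x^2 - \sum x_b^2(\text{rows}) - \sum x_b^2(\text{cols}) - \sum^{rcn} x_w^2 = 9008.75 - 54.59 - 308.30 - 6802.40 = 1843.46$$

$$\sum^{rcn} x_w^2 = \frac{n\sum^{N} X^2 - \sum^{r}\sum^{c}\left(\sum^{n} X\right)^2}{n} = \frac{(5 \times 92{,}241) - 427{,}193}{5} = 6802.40$$

$$\sum x^2 = \frac{N\sum^{N} X^2 - \left(\sum^{N} X\right)^2}{N} = \frac{(100 \times 92{,}241) - (2{,}885)^2}{100} = 9008.75$$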
Notation. In the tables, r is the number of rows and c is the number of columns.
When used as superscripts, r and c indicate that summation has been over rows
or over columns. In Table 14.4, n refers to the number of cases within cells;
that is, the number of replications. When used as a superscript, n indicates
summation within cells. When rows or columns are summed, summation is
over cell totals, so that the sum of any array is indicated with a double summation
sign, while summation over the entire matrix may be indicated with triple
summation signs, as $\sum^{r}\sum^{c}\sum^{n} x_w^2$ (in which, as always, the order of summation is
from right to left) or with a single summation sign if all cases can enter directly
into the sum, as in $\sum^{N} X$, $\sum^{N} X^2$, and $\sum^{N} x^2$.


TABLE 14.5. TWO-WAY ANALYSIS OF VARIANCE
(Data from Table 14.4.)


                          c
r            1        2        3        4        5       $\sum^{r} X$    $\sum^{r} X^2$    $\sum^{r} x_w^2$
1           46       19       27       18       30        140        4430        510.00
2           30       43       27       28       31        159        5223        166.80
3           32       39       24       24       26        145        4373        168.00
4           29       50       34       35       25        173        6347        361.20

$\sum^{c} X$       137      151      112      105      112       $\sum^{N} X$ = 617
$\sum^{c} X^2$    4881     6231     3190     2909     3162       $\sum^{N} X^2$ = 20,373
$\sum^{c} x_w^2$  188.75   530.75    54.00   152.75    26.00

$\sum x_w^2(\text{rows})$ = 1206.00
$\sum x_w^2(\text{cols})$ = 952.25


$$\sum x_b^2(\text{cols}) = \sum^{c}\frac{\left(\sum X\right)^2}{r} - \frac{\left(\sum^{N} X\right)^2}{N} = \frac{(137)^2 + (151)^2 + (112)^2 + (105)^2 + (112)^2}{4} - \frac{(617)^2}{20} = 386.30$$

$$\sum x_b^2(\text{rows}) = \sum^{r}\frac{\left(\sum X\right)^2}{c} - \frac{\left(\sum^{N} X\right)^2}{N} = \frac{(140)^2 + (159)^2 + (145)^2 + (173)^2}{5} - \frac{(617)^2}{20} = 132.55$$

$$\sum x^2 = \frac{N\sum^{N} X^2 - \left(\sum^{N} X\right)^2}{N} = \frac{(20 \times 20{,}373) - (617)^2}{20} = 1338.55$$


Check: By rows                    Check: By columns

$\sum x_b^2$ =  132.55                 $\sum x_b^2$ =  386.30
$\sum x_w^2$ = 1206.00                 $\sum x_w^2$ =  952.25
$\sum x^2$   = 1338.55                 $\sum x^2$   = 1338.55


Remainder sum of squares = 1338.55 - 386.30 - 132.55 = 819.70


                             SUMS OF
                             SQUARES     df                      V        F
Between rows (methods)        132.55     (r - 1) = 3            44.18    F < 1
Between columns (tutors)      386.30     (c - 1) = 4            96.58    1.41
Remainder                     819.70     (r - 1)(c - 1) = 12    68.31
TOTAL                        1338.55     (N - 1) = 19
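
The partition shown in Table 14.5 can be retraced programmatically. The
following Python sketch is illustrative only; the function name
`two_way_single_replication` and the dictionary keys are hypothetical. It
assumes one observation per cell, takes the 4 x 5 array of Table 14.5, and
forms the between-rows, between-columns, and remainder sums of squares
together with the two F ratios.

```python
# Single observation per cell (rows = methods, columns = tutors), from Table 14.5.
scores = [
    [46, 19, 27, 18, 30],
    [30, 43, 27, 28, 31],
    [32, 39, 24, 24, 26],
    [29, 50, 34, 35, 25],
]


def two_way_single_replication(table):
    """Partition the total sum of squares of an r x c table with one case per cell."""
    r, c = len(table), len(table[0])
    n_total = r * c
    correction = sum(map(sum, table)) ** 2 / n_total            # (sum X)^2 / N
    total_ss = sum(x * x for row in table for x in row) - correction
    rows_ss = sum(sum(row) ** 2 for row in table) / c - correction
    cols_ss = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    remainder_ss = total_ss - rows_ss - cols_ss
    remainder_var = remainder_ss / ((r - 1) * (c - 1))
    return {
        "rows_ss": rows_ss,
        "cols_ss": cols_ss,
        "remainder_ss": remainder_ss,
        "total_ss": total_ss,
        "f_rows": rows_ss / (r - 1) / remainder_var,
        "f_cols": cols_ss / (c - 1) / remainder_var,
    }


result = two_way_single_replication(scores)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The remainder is found by subtraction, as in the table, so any error in a
marginal total would show up in the check against the total sum of squares.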


Computations. More computations are presented in Table 14.5 than would
ordinarily be the case with a two-way analysis with a single replication. The
computations show:


1. $\sum X$ and $\sum X^2$ for both rows and columns, with the superscript r or c
indicating whether the summation has been in rows or columns;

2. The sum of the squares within each array as found by Formula 14.2. These
sums have been summed to find the within-groups sum of squares, $\sum x_w^2$,
for both rows and columns.

3. Sums of squares between rows and also sums of squares between columns,
$\sum x_b^2$, as found by Formula 14.6a. It is seen that for both rows and for
columns, the between-arrays sum of squares plus the within-arrays sum of
squares equals the total sum of squares. This information is presented as a
check.

4. In this analysis, the total sum of squares is divided into three portions:
$\sum x_b^2(\text{rows})$, $\sum x_b^2(\text{columns})$, and the "remainder" sum of squares, or the error
term. In a simple analysis, $\sum x_w^2$ is used to form the denominator for a single F.
In a two-way analysis of variance, the remainder sum of squares is used
for two F tests, one to test whether variation between rows (that is, among
methods) is significant and the other to test whether variation between
columns (that is, among tutors) is significant.

5. The degrees of freedom for rows is (r - 1); for columns, (c - 1); and for
the "remainder" term, (r - 1)(c - 1). It is to be noted that (r - 1) +
(c - 1) + (r - 1)(c - 1) = (N - 1), the total number of degrees of freedom.


Analysis of the Complete Set of Data. Computations leading to an analysis
of the complete set of data are shown in Table 14.4. The new feature (as
compared with Table 14.5) is a remainder, or error, or residual sum of squares
computed within the cells. This is used to form the denominator variance for
testing not only the significance of the variation among the two sets of means
(which, as in Table 14.5, are not actually computed), but also the "interaction"
of the two independent variables. This "interaction" sum of squares corresponds
to the error, or remainder, sum of squares in Table 14.5, where there is no
replication.

In both simple and complex analysis of variance, the denominator of the F
test is based on the unexplained, or error, sum of squares; that is, that portion
of the total sum of squares not ascribable to identifiable "sources" of
variability. In Table 14.2, the "unexplained" variance is within groups; in Table
14.5, it is that portion not associated with either the row means or the column
means; in Table 14.4, it is "within cells." This explains in part why the remainder
sum of squares in Table 14.5 is analogous to the interaction sum of squares
in Table 14.4.

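The between-rows and between-columns sums of squares are found by
modifying Formula 14.6a as follows:

$$\sum x_b^2(\text{rows}) = \frac{r\sum^{r}\left(\sum^{c}\sum^{n} X\right)^2 - \left(\sum^{N} X\right)^2}{N} \qquad (14.6b)$$

$$\sum x_b^2(\text{cols}) = \frac{c\sum^{c}\left(\sum^{r}\sum^{n} X\right)^2 - \left(\sum^{N} X\right)^2}{N} \qquad (14.6c)$$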


Actually, the mode of computation is the same as in simple analysis of variance,
the change in notation resulting from the fact that a column sum is now
$\sum^{r}\sum^{n} X$ (instead of $\sum^{n} X$) with $\sum^{c}\sum^{n} X$ for a row sum, and from the fact that
the n of Formula 14.6 becomes nr for columns (nc for rows). For ease in
computation, the two terms of the formula have been placed over N, utilizing
the fact that N = rcn.

The within-cells sum of squares, $\sum^{rcn} x_w^2$, is analogous to the within-groups
sum of squares of Example 14.1 except that it represents variation around cell
means instead of around means of entire columns. It can, in fact, be found from
Formula 14.2, summed over rows and columns as follows:

$$\sum^{rcn} x_w^2 = \sum^{c}\sum^{r}\left[\sum^{n} X^2 - \frac{\left(\sum^{n} X\right)^2}{n}\right] \qquad (14.2a)$$

in which the summation of X and $X^2$ is within each of the cells.

An alternate formula, more convenient in use because it involves only $\sum X^2$
for the entire problem and the squares of the cell sums $\left(\sum^{n} X\right)$, is

$$\sum^{rcn} x_w^2 = \frac{n\sum^{N} X^2 - \sum^{r}\sum^{c}\left(\sum^{n} X\right)^2}{n} \qquad (14.2b)$$

This formula is applied in Table 14.4.


The "interaction" sum of squares may be found by subtracting from the total
sum of squares those sums of squares between rows, between columns, and
within cells.


The analysis for Table 14.4 now takes this form:


                     SUMS OF
                     SQUARES     df                      V         F
Between rows           54.59     r - 1 = 3              18.20     F < 1
Between columns       308.30     c - 1 = 4              77.08     F < 1
Interaction          1843.46     (r - 1)(c - 1) = 12   153.62     1.81
Within cells         6802.40     N - rc = 80            85.03


TOTAL                9008.75     N - 1 = 99
The estimated variance based on the within-cells sum of squares is the
denominator for all three F tests. For rows and columns, F's are less than 1.00
and are clearly insignificant. Hence, if rows represent methods of instruction
and columns represent tutors, we may conclude that in this study neither had
a demonstrable effect on outcome.

The value of F for interaction of methods and tutors is 1.81. Reference to
Table Fs (Appendix) indicates that for 12 df in the numerator variance and 80 df
in the denominator, an F of 1.88 is required for significance at the .05 level.
Hence we can conclude that interaction variance also is within sampling error
and is statistically insignificant.
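
The complete analysis can be reproduced from the marginal totals alone. The
sketch below is an illustration, not the author's routine; all variable names
are hypothetical. One input is an assumption: the sum of the squared cell
totals is taken as 427,193, the value implied by the within-cells result of
6802.40 and by $\sum X^2$ = 92,241, since the individual cell entries are not
restated here.

```python
# Marginal totals of the 4 x 5 design with 5 replications per cell.
row_totals = [749, 725, 699, 712]
col_totals = [567, 632, 525, 568, 593]
sum_x_sq = 92241                  # sum of the 100 squared scores
sum_cell_totals_sq = 427193       # ASSUMED: implied by the within-cells SS of 6802.40

r, c, n = len(row_totals), len(col_totals), 5
N = r * c * n
correction = sum(row_totals) ** 2 / N                       # (sum X)^2 / N

total_ss = sum_x_sq - correction
rows_ss = sum(t * t for t in row_totals) / (c * n) - correction
cols_ss = sum(t * t for t in col_totals) / (r * n) - correction
within_ss = sum_x_sq - sum_cell_totals_sq / n               # Formula 14.2b
interaction_ss = total_ss - rows_ss - cols_ss - within_ss

within_var = within_ss / (N - r * c)                        # denominator of every F
f_rows = rows_ss / (r - 1) / within_var
f_cols = cols_ss / (c - 1) / within_var
f_interaction = interaction_ss / ((r - 1) * (c - 1)) / within_var

print(f"rows {rows_ss:.2f}, columns {cols_ss:.2f}, interaction {interaction_ss:.2f}")
print(f"within cells {within_ss:.2f}, total {total_ss:.2f}")
print(f"F rows {f_rows:.2f}, F columns {f_cols:.2f}, F interaction {f_interaction:.2f}")
```

Because the interaction term is obtained by subtraction, it absorbs any
rounding in the other components; carrying full precision until the end avoids
that.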


Two analyses are presented. One, with a single observation in each cell,
demonstrates how two F ratios may be found when there are two independent
variables. An F ratio is computed for each of the two sets of means.
The second analysis finds three F ratios, one for each of the sets of means,
and a third for the interaction of the two independent variables. An error
term, or residual term, for the denominator of the F ratio becomes available
for testing interaction when there are a number of replications in each of
the cells, in this case, five.

Example 14.4 thus extends the basic technique to the case of two 
independent variables and investigates not only the effect of each of them 
separately on the dependent variable but also their effect in combination. 


SUMMARY 


A group of techniques collectively called the analysis of variance has 
found wide application in psychological and educational research. 
Basically, the question investigated is whether or not the means of a 
dependent variable in groups and subgroups vary more than could be 
expected on the basis of chance sampling. If greater variation than could 
be expected is actually found, a nonzero relationship between independent 
and dependent variable is indicated. The use of the F ratio, the ratio
of two independent estimates of the population variance, evaluated against
the F distribution, is the most distinctive characteristic of the analysis of
variance.
Two simple cases are considered in this chapter: 


1. The case of a single independent variable; and 
2. The case of two independent variables. 


In the two examples, adequate selection of cases and appropriate
control of extraneous variation is assumed. More complicated cases are
discussed in advanced books on experimental design.
EXERCISES


1. For the following six groups of five cases each, k = 6, n = 5, N = 30. By
computing the total sum of squares, the between-groups sum of squares, and
the within-groups sum of squares, show arithmetically that
$\sum x^2 = \sum x_b^2 + \sum x_w^2$.


GROUP A GROUP B GROUP C GROUP D GROUP E GROUP F


7 Й 7 8 6 9 
5 9 4 10 8 11 
9 8 5 2 7 3 
1 6 6 6 5 7 
3 5 3 4 4 5 


2. In a large psychological clinic, four psychologists were assigned cases at
random until each had tested 20 patients. The following means and standard
deviations for obtained I.Q.s were observed.


PSYCHOLOGIST      M        s

A                90.7     14.0
B                87.9     15.0
C                92.0     13.5
D                91.8     13.7


The standard deviations were found by the formula $s = \sqrt{\sum x^2 / n}$, in which
n is the number of cases in the group. Accordingly,

$$\sum^{n} x^2 = ns^2$$

Test whether the means differ significantly at the .05 level.


3. By Bartlett’s test for homogeneity of variance, test whether the variances in 
Exercise 2 can be considered homogeneous. 

4. In a learning study, Hull (3) tested three patients in each of three diagnostic
categories. The criterion was the number of minutes required to form
associations between 12 Chinese characters and 12 spoken nonsense syllables. Test
whether the group means differ significantly at the .05 level.


CONSTITUTIONAL     DEMENTIA
INFERIORS          PRAECOX      PARETICS
      89              45           200
     170              52           150
      72              39           140


5. How many observations would be required in a 2 x 3 factorial experiment
with 25 replications?

6. A simple analysis of variance study used 45 subjects divided randomly and
equally in five groups. The total sum of squares was 2,319.6 and the between-
groups sum of squares was 642.8. Compute and evaluate F.
7. For the following table, based on a two-way classification, determine which
of the three F's are significant at the .01 level or better.


SOURCE OF SUMS OF 
VARIATION SQUARES df 
Rows 7,382.7 1 
Columns 9,111.8 2 
Interaction 672.3 2 
Within cells 21,357.8 43 
TOTAL 48 


8. Data for the nine cells below were found as follows:

1. At random, 90 values were drawn from the Rand table of 100,000 normal
deviates (6) and assigned to the nine cells, ten cases in each cell.

2. The scores in each cell were converted linearly to two-digit scores with no
negative values.

3. Systematic differences in both rows and columns were introduced by
adding constants within cells as follows:


CONSTANTS ADDED
WITHIN CELLS


Accordingly, one might well expect to find significant F's. Make the
appropriate tests for significant differences for rows, for columns, and for
interaction.

COLUMNS

ROWS 1 2 3

1 24 25 16 20 38 33 
34 14 2| 29 25 44 
23 12 21 40 2 п 

4 d 38 24 44 16 

48 28 117525 30 48 

2 26 21 17 28 21 2 
31 30 14. “ТІ 26 37 
11 13 16 34 22 40 
18 п 32 26 28 29 
ЗІ 279 18 30 32 41 

3 14 29 4 21 38 32 
21 22 30 32 19 46 
27 34 25. 12 20% 
24 16 4 2 20 34 
48 16 19 12 29 3$ 
15

STATISTICS IN TEST CONSTRUCTION AND INTERPRETATION


A psychological test is valid to the degree to which it measures an aspect 
of human behavior, and reliable to the degree it yields consistent results, 
either from time to time or through alternate measurements. 

Some psychological tests yield only a global score, such as the number
of seconds required to complete a task, or the number of pegs turned in a
fixed amount of time. Such tests may be normed, correlated with a criterion
to determine validity, and correlated with alternate forms or with retest
data to estimate reliability. However, since they are not composed of
separately scored items, they are not amenable to item analysis, one of the
major topics of this chapter.


HOMOGENEOUS AND HETEROGENEOUS TESTS


Psychological tests are generally composed of discrete items, which may be
considered as little variables entering into a sum variable, the total score.
Some tests are homogeneous, that is, made up of items that tend to measure
the same basic characteristic, as indicated by positive interrelationships.
However, if the intercorrelations are too high, the test would be internally
consistent, but it would not constitute a scale making a large number of
useful discriminations among people.

Other psychological tests are heterogeneous, that is, composed of
items having relatively low intercorrelations and measuring different
aspects of behavior. If the items have relatively high relationships with a
criterion and relatively low relationships with one another, the resulting
test may be far more efficient for a practical purpose (such as selecting
candidates for a position) than would a homogeneous test. However,
heterogeneous tests are of limited scientific interest, since the characteristic
measured can be stated only in terms of a criterion. Often they are single-
purpose tests, useful only for prediction in a few specific situations. The
homogeneous tests are more versatile. They can be used in a variety of
ways because they are inherently meaningful and measure variables of
psychological interest.


ITEM STATISTICS 


When an item is considered a variable, it is most often dichotomous,
taking values of either 0 or 1. As a dichotomous variable it has a mean of
p, in which p is $N_p/N$, the proportion of individuals passing. It has a
variance of pq, in which q is the proportion of individuals failing the item.
The standard deviation is $\sqrt{pq}$.

For item intercorrelations the phi coefficient is appropriate, and if the
covariance is needed, it is found by the formula

$$C_{ij} = \frac{P_{ij}}{N} - p_i p_j \qquad (15.1)$$

in which $P_{ij}$ is the number of cases passing both item i and item j and in
which $p_i$ and $p_j$ are the item means. This covariance, divided by the
product of the two item standard deviations, yields the phi coefficient, or
product-moment correlation, between the two items (discussed in Chapter 9).
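
A small numerical case may make Formula 15.1 concrete. The responses below
are invented for illustration (ten hypothetical examinees on two hypothetical
items), and the function name `item_stats` is likewise an assumption. The
sketch finds each item's mean p and standard deviation $\sqrt{pq}$, the covariance
by Formula 15.1, and the phi coefficient as the covariance divided by the
product of the two standard deviations.

```python
import math

# Hypothetical responses (1 = pass, 0 = fail) of ten examinees to two items.
item_i = [1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
item_j = [1, 0, 1, 0, 1, 0, 1, 0, 0, 1]


def item_stats(responses):
    """Return (p, standard deviation) of a 0/1 item, with s = sqrt(p * q)."""
    p = sum(responses) / len(responses)
    return p, math.sqrt(p * (1 - p))


n_cases = len(item_i)
p_i, s_i = item_stats(item_i)
p_j, s_j = item_stats(item_j)

# Formula 15.1: covariance = (number passing both) / N - p_i * p_j
passing_both = sum(a * b for a, b in zip(item_i, item_j))
covariance = passing_both / n_cases - p_i * p_j

# Phi coefficient: covariance over the product of the item standard deviations.
phi = covariance / (s_i * s_j)

print(f"p_i = {p_i:.2f}, p_j = {p_j:.2f}, covariance = {covariance:.3f}, phi = {phi:.3f}")
```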

The correlation between a dichotomous item and a continuous variable,
such as the total score in the test of which the item is a part, or an outside
criterion, is generally the point biserial r, which, as was pointed out in
Chapter 9, is also an algebraic variant of product-moment r. The correlation
between an item and the total score to which it contributes may be
denoted as $r_{it}$, and the correlation between an item and an outside criterion
may be indicated as $r_{ic}$.


CONCEPT OF RANDOM ERROR 


Much of the theory of psychological tests centers around the concept of
random error. By definition, random error is completely uncorrelated with
the true score in the test and with random error in any other set of
measurements. This is a logical postulate, which can be used to examine test
reliability and related concepts.

Let X, the total score on a test, be represented as the algebraic sum of a
true score A and an uncorrelated error component E. Then,

$$X = A + E \qquad (15.2)$$

from which, by summing and dividing by N, the relationship of the means
is found:

$$M_x = M_a + M_e \qquad (15.3)$$

Subtracting Eq. 15.3 from Eq. 15.2 yields the relationship of the
deviations:

$$x = a + e \qquad (15.4)$$

Squaring Eq. 15.4, summing, and dividing by N yield

$$\frac{\sum x^2}{N} = \frac{\sum a^2}{N} + \frac{2\sum ae}{N} + \frac{\sum e^2}{N}$$

However, $\sum ae/N$ is the covariance between the true score and the error
score. Since the corresponding correlation, by definition, is zero, the
covariance must be zero, and can be dropped. The other terms can be
written as variances:

$$V_x = V_a + V_e \qquad (15.5)$$

That is to say, the total variance is the sum of two component parts: the
true variance and the error variance. If all terms in Formula 15.5 are
divided by $V_x$, it is seen that the total variance, taken as 1.00, is divisible
into a proportion that is true or reliable variance and a proportion that
is the variance of the random error component.

Consider two tests that are theoretical equivalents in that the true parts
of each test measure exactly the same function and are perfectly correlated,
but which also contain a certain proportion of random error. By definition,
the random error in one test is uncorrelated with the random error in the
other.

A form cognate with Eq. 15.4, but distinguished by primes, can
represent the equivalent test:

$$x' = a' + e' \qquad (15.4a)$$


TEST RELIABILITY

The correlation between x and x' can be taken as the reliability, or
consistency, or self-correlation of the test. To estimate it algebraically, we
multiply Eq. 15.4 by Eq. 15.4a, sum, and divide by N. This gives

$$\frac{\sum xx'}{N} = \frac{\sum aa'}{N} + \frac{\sum ae'}{N} + \frac{\sum a'e}{N} + \frac{\sum ee'}{N} \qquad (15.4b)$$

Since $\sum a'e/N$, $\sum ae'/N$, and $\sum ee'/N$ are covariances corresponding to zero
correlations, they are all zero and drop out. $\sum aa'/N$ is, by definition, the
numerator of a perfect correlation, and is therefore equal to the product
of the two standard deviations, $s_a$ and $s_{a'}$. However, by definition, $s_a = s_{a'}$.
Accordingly, Formula 15.4b can be rewritten as

$$\frac{\sum xx'}{N} = V_a$$

Dividing both sides by $s_x s_{x'}$, or $V_x$, since the standard deviations of the
equivalent tests are equal,

$$\frac{\sum xx'}{N s_x s_{x'}} = r_{xx} = \frac{V_a}{V_x} = \frac{V_x - V_e}{V_x} \qquad (15.6)$$

which can be read: "The reliability of a test, $r_{xx}$, is $V_a/V_x$, the proportion
of the total variance that is true variance, or $1 - V_e/V_x$, that is, 1 less the
proportion of error variance."
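
Formulas 15.5 and 15.6 can be seen at work in a deliberately artificial case.
In the sketch below the four "true" deviations and the two sets of "error"
deviations are invented so that every covariance assumed to be zero in the
derivation is exactly zero; the helper names `variance`, `covariance`, and
`correlation` are hypothetical. Variances are divided by N, as in the text.

```python
import math

# Invented deviation scores: true part a, error parts e (test X) and e2 (test X').
a = [1, 1, -1, -1]
e = [1, -1, 1, -1]
e2 = [1, -1, -1, 1]

x = [ai + ei for ai, ei in zip(a, e)]          # x  = a + e   (Eq. 15.4)
x2 = [ai + ei for ai, ei in zip(a, e2)]        # x' = a' + e' (Eq. 15.4a), with a' = a


def covariance(u, v):
    """Mean cross-product of deviations from the mean, divided by N."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / len(u)


def variance(u):
    return covariance(u, u)


def correlation(u, v):
    return covariance(u, v) / math.sqrt(variance(u) * variance(v))


# The postulates: error is uncorrelated with true score and with other error.
assert covariance(a, e) == 0 and covariance(a, e2) == 0 and covariance(e, e2) == 0

reliability = correlation(x, x2)
print(f"V_x = {variance(x)}, V_a = {variance(a)}, V_e = {variance(e)}")
print(f"r_xx = {reliability}, V_a / V_x = {variance(a) / variance(x)}")
```

With real data the covariances are only approximately zero, so the equalities
hold only approximately; the constructed case simply shows the algebra.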

As applied to a psychological measuring instrument, reliability may be 
defined as its consistency, or the degree to which it correlates with itself. 
While this correlation is unknown and unknowable, it can be estimated by 
four different methods: test-retest, alternate forms, “split-half,” and 
rational equivalence. Lack of reliability is conceived as the result of the 
presence of random error. A test consisting exclusively of random error 
would have no correlation with its theoretical equivalent test and would 
have a reliability of .00, while a test that included only “true” variance 
would have a reliability of 1.00. 


TEST-RETEST METHOD 


In the test-retest method, a group of individuals is given the same test on 
two different occasions. The ordinary product-moment r is computed and 
is used as the estimate of the reliability. 

For some tests, which have no alternate form and which consist of a 
total task rather than a set of separate subtests or items, test-retest is the 
only feasible method. It is a relatively poor method for tests in which there 
is a good deal of learning from one exposure to the next, and on which 
subjects vary considerably in how much they learn. However, a test-retest 
correlation, at least for a motor task, is probably a “lower bound” of the 
true reliability, which is likely to be higher than the obtained coefficient. 


ALTERNATE FORMS 


The method of estimating reliability by the use of alternate forms requires 
at least two forms of a test, which are considered to be equivalent. The 
forms are administered to the same group of subjects and results are 
correlated. As does test-retest, the method of alternate forms tends to 
underestimate rather than overestimate the reliability. The reason is that
with most material, it is difficult to build two forms of the same measuring 
device that are precisely equivalent both in assaying the same function and 
in including the same proportion of random error. However, when 
alternate forms exist, the correlation between them must be considered an 
essential reliability estimate to report. 


"SPLIT-HALF" RELIABILITY 

The estimation of reliability by either test-retest or alternate forms is
applicable to most kinds of psychological tests. Both methods are applicable 
to "speed" tests, which are so timed that very few subjects complete all 
items and in which the difficulties of the items are relatively low. They are 
also applicable to “power” tests, consisting of difficult items, but generally 
with enough time allowed so that subjects can attempt all items. 

The *'split-half" method, now to be described, applies only to power 
tests. It is useful when no alternate form exists and when it is not feasible 
to administer the same test twice to the same group of subjects. 

The test is divided into two halves, which are judged to be equivalent. 
Sometimes two groups of items are equated on the basis of their item 
difficulties; sometimes the score on the odd items (1, 3, 5 ++) is considered 
the score on one-half of the test; the score on the even items (2, 4, 6 =), 
the score on the other half. This latter procedure is sometimes described 
as “о44-еуеп” reliability. 

The scores on the two halves are now correlated, yielding the estimated 
reliability of one half of the test. 

To infer the reliability of the whole, it is necessary to assume that the 
two halves are truly equivalent and that the correlation between them is 
typical of the six correlations that would be obtained among four 
half-tests, if two more half-tests were available. 

Let $z_a$, $z_{a'}$, $z_b$, and $z_{b'}$ be z scores on four equivalent half 
tests, of which only a and a' (halves of X) actually exist. Both in raw-score 
form and in z form, all variances of the half tests are considered equal, and 
all intercorrelations are taken as $r_{aa'}$, the observed correlation between 
the halves of X. The problem is to infer the correlation between the complete, 
existing test X and its hypothetical equivalent, X' or b + b'. In a 
permissible deviation form, 

$$x = z_a + z_{a'} \tag{15.7}$$

and 

$$x' = z_b + z_{b'} \tag{15.7a}$$



Multiplying Eq. 15.7 by Eq. 15.7a, summing, and dividing by N yield 

$$C_{xx'} = \frac{\sum xx'}{N} = \frac{\sum z_a z_b}{N} + \frac{\sum z_a z_{b'}}{N} + \frac{\sum z_{a'} z_b}{N} + \frac{\sum z_{a'} z_{b'}}{N} = 4r_{aa'} \tag{15.8}$$

Squaring Eq. 15.7, summing, dividing by N, and replacing z score 
variances with unity give 

$$V_x = \frac{\sum x^2}{N} = 2 + 2r_{aa'} \tag{15.9}$$

which is also $V_{x'}$. 
The correlation is found by dividing Formula 15.8 by Formula 15.9. 
Thus, 

$$r_{xx'} = \frac{C_{xx'}}{\sqrt{V_x V_{x'}}} = \frac{4r_{aa'}}{2 + 2r_{aa'}} = \frac{2r_{aa'}}{1 + r_{aa'}} \tag{15.10}$$

which indicates that the reliability of a variable when doubled in length 
may be estimated as twice the reliability of the half ($r_{aa'}$) divided by 
$(1 + r_{aa'})$. 


SPEARMAN-BROWN "PROPHECY" FORMULA 


If Eqs. 15.7 and 15.7a are extended to the general case, they may be 
written as 

$$x = z_a + z_b + \cdots + z_n \tag{15.7b}$$

$$x' = z_{a'} + z_{b'} + \cdots + z_{n'} \tag{15.7c}$$

On the premise that all n components are z scores with unit variance 
and that all $(n^2 - n)$ intercorrelations are equal to $r_{aa'}$, the variance 
of either variable is 

$$V_x = V_{x'} = n + (n^2 - n)r_{aa'} \tag{15.11}$$

By multiplying Eq. 15.7b by Eq. 15.7c, summing, dividing by N, and 
taking each of the $n^2$ covariances as $r_{aa'}$, it is found that 

$$C_{xx'} = n^2 r_{aa'} \tag{15.12}$$

Dividing Formula 15.12 by Formula 15.11 and simplifying yield the 
general Spearman-Brown "prophecy" formula for reliability: 

$$r_{xx'} = \frac{C_{xx'}}{\sqrt{V_x V_{x'}}} = \frac{n^2 r_{aa'}}{n + (n^2 - n)r_{aa'}} = \frac{n r_{aa'}}{1 + (n - 1)r_{aa'}} \tag{15.13}$$

By substituting 2 for n, it can be seen that Formula 15.10 is a special 
case of Formula 15.13. 

With the type of material that goes into conventional aptitude and 
achievement tests, it has been found that Formula 15.13 works reasonably 
well, provided (1) the additional material is similar to the old, (2) the 
revised test is not very long, and (3) the extra material does not materially 
change the task for the subjects. A very long test can become boring, and 
loss of interest can affect reliability adversely. 
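
As an illustration of Formulas 15.10 and 15.13, the following is a minimal sketch in Python. The function name `spearman_brown` and the half-test correlation of .60 are hypothetical choices made for the example, not values from the text.

```python
def spearman_brown(r_unit: float, n: float) -> float:
    """Formula 15.13: estimated reliability of a test lengthened n times.

    r_unit is the reliability of the test at unit length; n may be
    fractional (n < 1 shortens the test).
    """
    return n * r_unit / (1.0 + (n - 1.0) * r_unit)


# Formula 15.10 is the n = 2 case: stepping up a split-half correlation.
half_r = 0.60                        # hypothetical correlation between halves
full_r = spearman_brown(half_r, 2)   # 2(.60) / (1 + .60) = .75
```

The same function covers shortening as well as lengthening, subject to the three provisos stated above.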
INFLUENCE OF TEST LENGTH ON RELIABILITY 

The development of a reliability formula in terms of true variances and 
error variances may be helpful. To the original test as indicated in Formula 
15.4, $x_1 = a_1 + e_1$, let there be added any amount of new material, each 
unit of the new material having the same proportion of true and error 
variance as the original test. In deviation form the expanded test can be 
represented as 

$$x = a_1 + e_1 + a_2 + e_2 + \cdots + a_n + e_n$$

This expression is squared, summed, and divided by N. All covariances 
involving error are zero and disappear. All n true variances are equal, as 
are all n error variances. All $(n^2 - n)$ covariances of two a terms approach 
$V_a$ as a limit and are considered equal to $V_a$. 

By making these substitutions, 

$$V_x = n^2 V_a + nV_e$$

It is obvious that the true variance in $V_x$ is $n^2 V_a$ and that the 
proportion of true variance to the total variance, or the reliability 
coefficient, $r_{xx'}$, is 

$$r_{xx'} = \frac{n^2 V_a}{n^2 V_a + nV_e} = \frac{nV_a}{nV_a + V_e} \tag{15.14}$$



If, for example, a test consists of ten items, each of which is .7 error and 
.3 true variance, the reliability is 

$$r_{xx'} = \frac{10(.3)}{10(.3) + .7} = \frac{3}{3.7} = .81$$

Actually, Formula 15.14 is merely an algebraic variant of Formula 15.13, 
as can be readily seen by substituting the unit reliability $r_{aa'}$ for $V_a$ 
and $(1 - r_{aa'})$ for $V_e$. 
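
The equivalence of Formulas 15.14 and 15.13 can be checked numerically. The sketch below assumes unit-length variances that sum to 1 (so that the true variance equals the unit reliability); the function names are hypothetical.

```python
def reliability_from_variances(n: float, v_true: float, v_error: float) -> float:
    """Formula 15.14: reliability of n units, each with the given
    true and error variance."""
    return n * v_true / (n * v_true + v_error)


def spearman_brown(r_unit: float, n: float) -> float:
    """Formula 15.13, repeated here so the block is self-contained."""
    return n * r_unit / (1.0 + (n - 1.0) * r_unit)


# Ten items, each .3 true and .7 error variance (the example in the text).
by_variances = reliability_from_variances(10, 0.3, 0.7)   # 3 / 3.7
by_prophecy = spearman_brown(0.3, 10)                     # same quantity
```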


RELIABILITY BY KR FORMULA 20 (RATIONAL EQUIVALENCE) 


A procedure that has similarities to correlating two halves of the same 
test and correcting the obtained correlation to estimate the reliability of 
the whole is the method of rational equivalence, using Kuder-Richardson¹ 
Formula 20. As is the method of alternate forms, it is based on the 
consistency of different measurements of the same general function. However, 
instead of reflecting the consistency of different forms, it is based on the 
consistency of items within a single form of the test. 


¹ Originally derived by Kuder and Richardson (2), using somewhat different 
assumptions. 
EXAMPLE 15.1 


TEST RELIABILITY BY THE METHOD OF RATIONAL EQUIVALENCE 


Purpose. The method of rational equivalence (using KR Formula 20) estimates 
the correlation of a test with its hypothetical equivalent. It assumes that the 
variance of the existing test is identical with the variance of the hypothetical test, 
and that the sum of the item covariances within the existing test is proportional 
to the sum of the between-test item covariances. 

Data. In Table 15.1 are shown the correct and incorrect responses of 25 college 
students on the 15 items of a mathematics test. Correct responses are indicated 
by 1, incorrect by 0. Total scores are shown under X, while the number passing 
each item ($N_p$) is shown under each column of item responses. The p values 
(not shown) would be found by dividing each $N_p$ by N, the total number of cases. 

Item variances are shown in the row designated as pq. It may be noted that 
2 of the 15 items make no discriminations at all (since their variances are .00) 
and 4 others have variances less than .08. Dropping items 4 and 5 would have 
no effect on test characteristics other than the mean, while the elimination of 
items 2, 3, 6, and 7 should not greatly change test reliability and validity. 

The following computations summarize the information in Table 15.1: 


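$$N = 25 \qquad \sum X = \sum N_p = 301 \qquad \sum N_p^2 = 6255$$

$$n = 15 \qquad \sum X^2 = 3729 \qquad \sum pq = \sum V_i = 2.032$$

$$V_x = \frac{25 \times 3729 - (301)^2}{25^2} = 4.20$$

By Formula 15.20, 

$$r_{xx'} = \frac{n}{n-1}\left(1 - \frac{\sum V_i}{V_x}\right) = \frac{15}{14}\left(1 - \frac{2.032}{4.20}\right) = .55$$

By Formula 15.20a, 

$$r_{xx'} = \frac{n}{n-1}\left(1 - \frac{N\sum X - \sum N_p^2}{N\sum X^2 - (\sum X)^2}\right) = \frac{15}{14}\left(1 - \frac{25 \times 301 - 6255}{25 \times 3729 - (301)^2}\right) = .55$$
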

As estimated by KR Formula 20, in this small sample the reliability is .55. 
This low reliability partly reflects the short length of the instrument, since in 
effect it has only nine items. 

In finding $r_{xx'}$ by KR Formula 20, the complete item matrix, as in Table 15.1, 
is seldom displayed. Various mechanical devices (scoring machines, tabulating 
machines, and electronic computers) can produce item counts ($N_p$ values) more 
or less automatically. In the absence of such a mechanical device, a hand method 
can be readily designed. From the $N_p$ it is necessary to find either $\sum V_i$ or $\sum N_p^2$. 
The variance is computed from N, $\sum X$, and $\sum X^2$, or from a frequency distribution 
of total scores. 


TABLE 15.1. ITEM SCORE MATRIX FOR 25 STUDENTS ON A 15-ITEM TEST 
(item responses not reproduced here) 


Like the split-half method, the method of rational equivalence is not 
applicable to speeded tests, since it would give spuriously high results. 
While test-retest and alternate forms are applicable to both homogeneous and 
heterogeneous tests, the method of correlation with a theoretical equivalent 
can be used only with homogeneous tests that have internally consistent 
items. If applied to a heterogeneous test, developed to predict an external 
criterion, the resultant coefficient would be an underestimation of the true 
reliability. 

The method is based on two assumptions: 

1. That the total variance of the hypothetical test, $V_{x'}$, is equal to the 
variance of the existing test, $V_x$. 

2. That the covariances of the items within the existing test are 
representative of the covariances between the items of the existing test and 
the items of the theoretical test. 

The second assumption may be so interpreted that the mean item 
covariance within the existing test can be taken as the mean covariance 
between items, one of each pair coming from the existing test and the 
other from the hypothetical test. 

The item scores (in deviation form) of the existing test can be represented 
as 

$$x = i + j + \cdots + n \tag{15.15}$$

and of the hypothetical test, 

$$x' = i' + j' + \cdots + n' \tag{15.16}$$

Multiplying Eq. 15.15 by Eq. 15.16, summing, dividing by N, and writing 
resultant terms as covariances, we have 

$$C_{xx'} = C_{ii'} + C_{ij'} + \cdots + C_{nn'}$$

With n items in each of the two tests, there are $n^2$ covariance terms. 
Denoting their mean as $\bar{C}_{ij'}$, we have 

$$C_{xx'} = n^2 \bar{C}_{ij'} \tag{15.17}$$

The correlation between x and x' is found by dividing both sides of 
Formula 15.17 by $s_x s_{x'}$. However, since $V_x = V_{x'}$, $s_x s_{x'} = V_x$. Then 

$$r_{xx'} = \frac{n^2 \bar{C}_{ij'}}{V_x} \tag{15.18}$$

The next step is to estimate $\bar{C}_{ij'}$ from the internal characteristics of 
the existing test, since the average item covariance within x (that is, 
$\bar{C}_{ij}$) is taken as the average covariance between items in x and items 
in x' (that is, $\bar{C}_{ij'}$). 
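Returning to Eq. 15.15, we square the expression, sum, divide by N, 
and obtain 

$$\begin{aligned}
\frac{\sum x^2}{N} = {} & \frac{\sum x_i^2}{N} + \frac{\sum x_i x_j}{N} + \cdots + \frac{\sum x_i x_n}{N} \\
 & + \frac{\sum x_i x_j}{N} + \frac{\sum x_j^2}{N} + \cdots + \frac{\sum x_j x_n}{N} \\
 & + \cdots \\
 & + \frac{\sum x_i x_n}{N} + \frac{\sum x_j x_n}{N} + \cdots + \frac{\sum x_n^2}{N}
\end{aligned}$$

It can be seen that on the right-hand side there are n variance terms, 
$V_i, V_j, \ldots, V_n$. Their sum can be written $\sum V_i$ (or $\sum s_i^2$). There are also 
$(n^2 - n)$ covariance terms, such as $C_{ij}$. Only their mean, $\bar{C}_{ij}$, is of 
interest, and their sum is $(n^2 - n)\bar{C}_{ij}$. 

Accordingly, 

$$V_x = \sum V_i + (n^2 - n)\bar{C}_{ij}$$

Solving for $\bar{C}_{ij}$, 

$$\bar{C}_{ij} = \frac{V_x - \sum V_i}{n^2 - n} \tag{15.19}$$

Substituting Formula 15.19 in Formula 15.18 yields 

$$r_{xx'} = \frac{n^2}{V_x} \cdot \frac{V_x - \sum V_i}{n^2 - n} = \frac{n}{n-1}\left(1 - \frac{\sum V_i}{V_x}\right) \tag{15.20}$$
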

This is Kuder-Richardson Formula 20, which requires three bits of 
information about a test in order to estimate its reliability: n, the number 
of items; $\sum V_i$, the sum of the item variances; and $V_x$, the total variance. 
Its use is not restricted to tests composed of dichotomous items; rather, 
the scoring system applied to the items may have any range as long as the 
total score is the simple sum of the item scores. 

If items are, in fact, dichotomous, and the score is either 1 or 0, then 
Formula 15.20 can be simplified for easier computation. It will be 
remembered that for dichotomous items, the sum of the scores equals the sum of 
the squares of the scores, both with value of $N_p$. Accordingly, 

$$V_i = \frac{N_p}{N} - \frac{N_p^2}{N^2} \tag{15.21}$$

Summing Formula 15.21 and noting that $\sum N_p = \sum X$, the sum of the 
total scores, we have 

$$\sum V_i = \frac{1}{N}\sum N_p - \frac{1}{N^2}\sum N_p^2 = \frac{1}{N}\sum X - \frac{1}{N^2}\sum N_p^2 \tag{15.22}$$

Dividing Formula 15.22 by $V_x$, or rather by its raw-score equivalent 
$(1/N^2)[N\sum X^2 - (\sum X)^2]$, and substituting the resulting quotient in Formula 
15.20 yield 

$$r_{xx'} = \frac{n}{n-1}\left(1 - \frac{N\sum X - \sum N_p^2}{N\sum X^2 - (\sum X)^2}\right) \tag{15.20a}$$

The use of this formula is shown in Example 15.1. 
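
Formulas 15.20 and 15.20a can be sketched in Python as follows. The function names and the small 4 x 3 item matrix are hypothetical; the summary figures passed to `kr20_from_counts` at the end are those reported in Example 15.1. Variances are taken with N (not N - 1) in the denominator, as in the derivation above.

```python
from typing import Sequence


def _variance(values: Sequence[float]) -> float:
    """Variance with N in the denominator."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def kr20(item_scores: Sequence[Sequence[float]]) -> float:
    """Formula 15.20 from an item-score matrix (rows are subjects)."""
    n = len(item_scores[0])                      # number of items
    totals = [sum(row) for row in item_scores]   # total score X per subject
    sum_item_var = sum(
        _variance([row[j] for row in item_scores]) for j in range(n)
    )
    return n / (n - 1) * (1 - sum_item_var / _variance(totals))


def kr20_from_counts(n, N, sum_x, sum_x2, sum_np2):
    """Formula 15.20a for items scored 1 or 0, from summary counts."""
    return n / (n - 1) * (1 - (N * sum_x - sum_np2) / (N * sum_x2 - sum_x ** 2))


# Hypothetical matrix: 4 subjects, 3 dichotomous items.
matrix = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
from_matrix = kr20(matrix)
from_counts = kr20_from_counts(n=3, N=4, sum_x=6, sum_x2=14, sum_np2=14)

# Summary figures of Example 15.1.
example_15_1 = kr20_from_counts(n=15, N=25, sum_x=301, sum_x2=3729, sum_np2=6255)
```

For dichotomous items the two functions agree, which is the point of the simplification leading to Formula 15.20a.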

EFFECT OF CHANGES IN RELIABILITY ON r 


Two principles already demonstrated can be used in estimating the effect 
of changes in the reliability of either (or both) of the variables entering 
into a correlation. These are: 


1. The fact that any variable can be considered the sum of two components, 
the true score and the random error. Thus, by Eq. 15.4, $x = a + e$. 
2. The fact that the ratio of the true variance to the total variance equals 
the self-correlation, or reliability, of the test. By Formula 15.6, 
$r_{xx'} = V_a / V_x$. 


Let $x''$ be a variable that includes a, the true score in x, but which, 
instead of $e$, has an alternate random component $e''$. Similarly, let $y''$ be a 
variable that includes b, the true score in y, but which, instead of $e'$, has a 
different random component $e'''$. The covariance between these variables is 
found by multiplying $(x'' = a + e'')$ by $(y'' = b + e''')$, summing, dividing 
by N, and dropping out covariances involving random error. Thus 

$$C_{x''y''} = \frac{\sum x''y''}{N} = \frac{\sum ab}{N} + \frac{\sum ae'''}{N} + \frac{\sum be''}{N} + \frac{\sum e''e'''}{N} = C_{ab}$$

By a similar development involving $(x = a + e)$ and $(y = b + e')$, it can 
be seen that the covariance of the original values, $C_{xy}$, is also $C_{ab}$. Hence, 

$$C_{x''y''} = C_{xy} \tag{15.23}$$

By analogy with Formula 15.6, the reliabilities of $x''$ and $y''$ are in each 
case the ratio of the true variance to the total variance; that is, 
$r_{x''x'''} = V_a / V_{x''}$ and 

$$V_{x''} = \frac{V_a}{r_{x''x'''}} \tag{15.24}$$

From Formula 15.6 it is seen that $V_a = V_x r_{xx'}$. Substitution of $V_x r_{xx'}$ for 
$V_a$ in Formula 15.24 yields 

$$V_{x''} = \frac{V_x r_{xx'}}{r_{x''x'''}} \tag{15.24a}$$

Similarly, the variance of $y''$ is 

$$V_{y''} = \frac{V_y r_{yy'}}{r_{y''y'''}} \tag{15.24b}$$

in which $r_{yy'}$ is the original reliability and $r_{y''y'''}$ is the modified reliability. 
To estimate $r_{x''y''}$, the correlation between $x''$ and $y''$, Formula 15.23 is divided 
by the square roots of Formulas 15.24a and 15.24b. Then 

$$r_{x''y''} = \frac{C_{xy}}{\sqrt{\dfrac{V_x r_{xx'}}{r_{x''x'''}} \cdot \dfrac{V_y r_{yy'}}{r_{y''y'''}}}} = r_{xy}\sqrt{\frac{r_{x''x'''}\, r_{y''y'''}}{r_{xx'}\, r_{yy'}}} \tag{15.25}$$

in which $r_{xx'}$ and $r_{yy'}$ are obtained reliabilities and $r_{x''x'''}$ and $r_{y''y'''}$ are new or 
hypothetical reliabilities. With the restriction that values over 1.00 are 
conventionally taken as 1.00, Formula 15.25 gives a satisfactory indication 
of the effect of changes in the reliability of one of (or both) the variables. 
If the reliability of only one, say x, is changed, then $r_{yy'} = r_{y''y'''}$ and the 
formula becomes 

$$r_{x''y} = r_{xy}\sqrt{\frac{r_{x''x'''}}{r_{xx'}}} \tag{15.25a}$$

which estimates the change in the correlation between x and y brought 
about by changes in the reliability of x. 

A special case of Formula 15.25 is the correction for attenuation, used 
to estimate the correlation between two variables when both become 
perfectly reliable; that is, when $r_{x''x'''} = r_{y''y'''} = 1.00$. In that case, Formula 
15.25 becomes 

$$r_{x''y''} = \frac{r_{xy}}{\sqrt{r_{xx'}\, r_{yy'}}} \tag{15.25b}$$



The use of these formulas is shown in Example 15.2. 
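
Formula 15.25 and its two special cases can be collected in one short Python function. This is a sketch only: the function and parameter names are hypothetical, and the cap at 1.00 follows the convention mentioned above.

```python
import math


def adjusted_r(r_xy, r_xx, r_yy, new_r_xx=None, new_r_yy=None):
    """Formula 15.25: expected correlation after the reliabilities change.

    Leaving a new reliability as None keeps that variable unchanged, which
    reduces the expression to Formula 15.25a. Setting both new reliabilities
    to 1.0 gives the correction for attenuation (Formula 15.25b).
    """
    new_r_xx = r_xx if new_r_xx is None else new_r_xx
    new_r_yy = r_yy if new_r_yy is None else new_r_yy
    estimate = r_xy * math.sqrt((new_r_xx * new_r_yy) / (r_xx * r_yy))
    return min(estimate, 1.0)   # values over 1.00 are taken as 1.00


# Correction for attenuation with r_xy = .25, reliabilities .80 and .70.
disattenuated = adjusted_r(0.25, 0.80, 0.70, new_r_xx=1.0, new_r_yy=1.0)

# Only x changes: reliability .81 lowered to .64, validity .45.
# The criterion reliability (.90 here, a hypothetical value) cancels out.
shortened = adjusted_r(0.45, 0.81, 0.90, new_r_xx=0.64)
```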


EXAMPLE 15.2 


ESTIMATING THE EFFECT OF CHANGES IN RELIABILITY ON r 


Purpose. The Spearman-Brown procedure (Formula 15.13) estimates the effect 
of changes in length on the reliability of a test. Reliability, of course, may also 
be modified by changes in item content or structure, but the effect of such 
alterations cannot be clearly predicted. The Spearman-Brown formula assumes 
item homogeneity. 

Formula 15.25 and its variants can be used to predict changes in the 
correlation between two variables as the result of changes in the reliability 
of one of them or of both. 

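Effects of Modifying Both Variables. Consider a variable X with $r_{xx'}$ of .80 
and variable Y with $r_{yy'}$ of .70. If $r_{xy}$ is .25, what correlation would be 
anticipated if both tests were made half as long? What would be the correlation 
if both tests were doubled in length? 


In Formula 15.13, n can take fractional as well as integral values. If n = .5, 
then 

$$r_{x''x'''} = \frac{.5 \times .80}{1 + (.5 - 1).80} = .667 \quad \text{and} \quad r_{y''y'''} = \frac{.5 \times .70}{1 + (.5 - 1).70} = .538$$

Substituting in Formula 15.25, 

$$r_{x''y''} = r_{xy}\sqrt{\frac{r_{x''x'''}\, r_{y''y'''}}{r_{xx'}\, r_{yy'}}} = .25\sqrt{\frac{.667 \times .538}{.80 \times .70}} = .20$$

Similarly, if n = 2, 

$$r_{x''x'''} = \frac{2 \times .80}{1 + (2 - 1).80} = .889 \quad \text{and} \quad r_{y''y'''} = \frac{2 \times .70}{1 + (2 - 1).70} = .824$$

Then, by Formula 15.25, 

$$r_{x''y''} = .25\sqrt{\frac{.889 \times .824}{.80 \times .70}} = .25\sqrt{1.308} = .29$$
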


Effect of Changing the Reliability of a Single Variable. Consider a predictor 
variable X, with reliability of .81 and validity of .45, for predicting a defined 
criterion. If the test were shortened so that the reliability became .64, what 
would be the expected validity? 
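
The arithmetic for this case is not shown above. A worked version, on the assumption that Formula 15.25a is the intended tool (with .81 as the obtained and .64 as the modified reliability of X), would run:

$$r_{x''y} = r_{xy}\sqrt{\frac{r_{x''x'''}}{r_{xx'}}} = .45\sqrt{\frac{.64}{.81}} = .45 \times \frac{.80}{.90} = .40$$
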


Estimating the Maximum Possible Effect of Changes in Reliability (Correction 
for Attenuation). Formula 15.25b, Spearman's correction for attenuation, 
estimates what a correlation would be between two variables that were made 
perfectly reliable. If $r_{xy}$ is .25, $r_{xx'}$ is .80, and $r_{yy'}$ is .70, what is the limit 
of $r_{xy}$ if all error variance is removed from X and Y? 
By Formula 15.25b, 

$$r_{x''y''} = \frac{r_{xy}}{\sqrt{r_{xx'}\, r_{yy'}}} = \frac{.25}{\sqrt{.80 \times .70}} = .334$$



It is seen that, in this instance, the maximum possible correlation is not much 
greater than that anticipated from doubling the length of both variables. 

Correction for attenuation is actually a special case of partial correlation, as 
described in Chapter 8. It can be demonstrated that the correlation between the 
random error component of variable X, $e_x$, and X itself is $\sqrt{1 - r_{xx'}}$. By 
definition, the random error component of Y, $e_y$, is uncorrelated with $e_x$ and 
with X. The matrix of correlations of $e_x$, $e_y$, X, and Y, with all variances 
taken as 1.00, is as follows: 

$$\begin{array}{c|cccc}
 & e_x & e_y & X & Y \\ \hline
e_x & 1.00 & .00 & \sqrt{1 - r_{xx'}} & .00 \\
e_y & & 1.00 & .00 & \sqrt{1 - r_{yy'}} \\
X & & & 1.00 & r_{xy} \\
Y & & & & 1.00
\end{array}$$

If the matrix operations described in Chapter 8 are carried out, 

$$r_{xy \cdot e_x e_y} = \frac{r_{xy}}{\sqrt{r_{xx'}}\sqrt{r_{yy'}}}$$

which, of course, is $r_{xy}$ with $e_x$ and $e_y$ partialed out. 


STANDARD ERROR OF MEASUREMENT 
A practical interpretation of the consistency of a test is in terms of the 
standard error of measurement. Essentially, the standard error of 
measurement is the likely standard deviation of the errors made in predicting 
true scores when we have knowledge only of the obtained scores. True scores 
are, of course, forever unknowable, but if we know the standard deviation 
of the discrepancies in estimating them, we also know the degree to which 
we can trust the scores obtained from our tests. 

Such a formula can be readily found by solving Formula 15.6 
($r_{xx'} = 1 - V_e/V_x$) for $V_e$. Then 

$$V_e = V_x - V_x r_{xx'}$$

By taking the square root of both sides, we have 

$$s_e = s_{meas.} = s_x\sqrt{1 - r_{xx'}} \tag{15.26}$$

The expression $s_x\sqrt{1 - r_{xx'}}$ gives the standard deviation of the 
discrepancies between true and observed scores. For example, for an I.Q. test 
on which the standard deviation is approximately 15 and the reliability is of 
the order of .95, 

$$s_{meas.} = 15\sqrt{1 - .95} = 15\sqrt{.05} = 3.35$$

This means that about two-thirds of the discrepancies between observed 
I.Q.s and true I.Q.s would be less than 3.35 I.Q. points. 
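
Formula 15.26 is simple enough to state as a one-line function. The sketch below uses a hypothetical function name and reproduces the figures of the example just given.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Formula 15.26: s_meas = s_x * sqrt(1 - r_xx')."""
    return sd * math.sqrt(1.0 - reliability)


# Standard deviation 15, reliability .95, as in the example above.
s_meas = standard_error_of_measurement(15, 0.95)
```

A perfectly reliable test gives a standard error of zero, and a test with zero reliability gives a standard error equal to the standard deviation of the obtained scores.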

The use of the standard error of measurement often assumes that the 
error in estimating the true score is the same in all parts of the range of the 
observed score. This by no means is necessarily true. In fact, the bivariate 
distribution of the two alternate forms of the 1937 Stanford Binet showed 
that the error of estimate was smaller for low I.Q.s than for high I.Q.s. 
This is evidence that the standard error of measurement in either form 
followed the same pattern. Accordingly, in using the standard error of 
measurement to estimate the limits within which a true score might be 
found, we must temper our interpretation with the knowledge that $s_{meas.}$ 
is a kind of average throughout the range and may not represent the 
situation in the part of the range in which we may be particularly interested. 


CHANGES IN RELIABILITY RESULTING FROM CHANGES IN RANGE 

When there is reason to believe that $s_{meas.}$ is in fact constant throughout 
the range, there is a way of estimating the effect of changes of range on 
reliability. Let $s_x\sqrt{1 - r_{xx'}}$ be $s_{meas.}$ in the range in which the reliability of 
the test is known; and let $s'\sqrt{1 - r'}$ be $s_{meas.}$ in a different range, estimated 
from a different standard deviation, $s'$, and different reliability, $r'$. Then, 
since the two variances of measurement are equal, 

$$V_x(1 - r_{xx'}) = V'(1 - r')$$

Solving for $r'$ yields 

$$r' = 1 - \frac{V_x(1 - r_{xx'})}{V'} \tag{15.27}$$

Consider a test with reliability of .84, in a range in which the standard 
deviation is 10 (and variance 100). What would be the reliability in a range 
in which the standard deviation is reduced to 8, and variance to 64? By 
Formula 15.27, 

$$r' = 1 - \frac{100(1 - .84)}{64} = .75$$

What would be expected to happen if the variance were increased to 
128? 

$$r' = 1 - \frac{100(1 - .84)}{128} = .875$$

It is understood, of course, that the "changes in range" come from 
adding or subtracting cases to a distribution. Changes in scale brought 
about by linear transformations of the original variable have no effect at 
all on the reliability of a variable or, except for rounding error, on its 
correlation with other variables. 
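
A short Python sketch of Formula 15.27 follows, with a hypothetical function name. It rests on the same assumption as the formula itself, namely that the variance of measurement is the same in both ranges.

```python
def reliability_in_new_range(r_old: float, var_old: float, var_new: float) -> float:
    """Formula 15.27: r' = 1 - V_x (1 - r_xx') / V'."""
    return 1.0 - var_old * (1.0 - r_old) / var_new


# The two cases worked in the text: reliability .84 at variance 100.
narrower = reliability_in_new_range(0.84, 100, 64)    # range curtailed
wider = reliability_in_new_range(0.84, 100, 128)      # range expanded
```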


CORRELATION BETWEEN TRUE AND OBTAINED SCORES 


To develop a formula for the correlation between true scores, a, and 
obtained scores, x, both sides of Eq. 15.4 ($x = a + e$) are multiplied by a, 
yielding 

$$ax = a^2 + ae$$

When this is summed and divided by N, $C_{ae}$ drops out as the covariance 
representing a zero correlation. Accordingly, 

$$C_{ax} = V_a$$

By dividing both sides by $s_a s_x$ and remembering that $V_a/V_x$ is $r_{xx'}$, the 
reliability coefficient, we have 

$$r_{ax} = \frac{C_{ax}}{s_a s_x} = \frac{V_a}{s_a s_x} = \frac{s_a}{s_x} = \sqrt{r_{xx'}}$$

which shows that the correlation between observed and true scores is the 
square root of the reliability coefficient. 
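
The result can be verified on a small constructed case. In the Python sketch below the four true scores and four errors are hypothetical values chosen so that their covariance is exactly zero, as the derivation requires; `pearson_r` is a helper written for the example.

```python
import math


def pearson_r(u, v):
    """Product-moment correlation, with N in the denominators."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v)) / n
    su = math.sqrt(sum((p - mu) ** 2 for p in u) / n)
    sv = math.sqrt(sum((q - mv) ** 2 for q in v) / n)
    return cov / (su * sv)


a = [1.0, -1.0, 1.0, -1.0]     # true scores, V_a = 1
e = [0.5, 0.5, -0.5, -0.5]     # random errors, V_e = .25, uncorrelated with a
x = [ai + ei for ai, ei in zip(a, e)]   # obtained scores, Eq. 15.4

reliability = 1.0 / (1.0 + 0.25)        # V_a / V_x = .80
r_ax = pearson_r(a, x)                  # equals sqrt(.80)
```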


TEST VALIDITY 


The criteria as to whether a test actually measures what it is supposed to 
measure are logical rather than statistical. While psychological test theory 
recognizes other types of validity, the only type of validity considered here 
is that involving the correlation of the instrument with one or more outside 
criteria. 

If a test has an outside criterion, measured along an interval scale, then 
the validity coefficient is merely the product-moment correlation between 
the test and the criterion. Such a coefficient is, of course, subject to the same 
considerations in its interpretation as is any other product-moment r, 
such as the question of linearity of regression if r is to be taken as an 
indication of the closeness of fit between the two variables; the question of 
homoscedasticity, if the standard deviation of the residuals is to be used as 
the standard error of estimate; and the usual questions regarding the 
adequacy of sampling and possible curtailment or expansion of the range, 
if the sample r is to be taken as a valid estimate of the parameter value. 

If the criterion is dichotomous, such as success or failure in a course of 
training, or whether or not a man receives a promotion in a certain length 
of time, it may be preferable to use a biserial r rather than a point biserial in 
computing validity coefficients. This will be true if we are trying to estimate 
what the relationship would be if the criterion information were on a 
continuous scale instead of in two categories. However, if we are using the 
validities in computing a multiple regression equation for forecasting the 
success of individuals for whom criterion information is not yet available, 
the point biserial is preferred to the regular biserial. 


EFFECT OF CHANGES IN LENGTH ON VALIDITY 
By changing notation somewhat, Formula 15.13 for the reliability of a test 
when lengthened n times can be written as 

$$r_{x''x'''} = \frac{n r_{xx'}}{1 + (n - 1)r_{xx'}} \tag{15.13a}$$

Dividing both sides by $r_{xx'}$ yields 

$$\frac{r_{x''x'''}}{r_{xx'}} = \frac{n}{1 + (n - 1)r_{xx'}} \tag{15.28}$$

which can be substituted in Formula 15.25a to estimate the correlation 
between x, when it is lengthened n times, and y, an outside criterion. The 
formula requires knowledge of $r_{xx'}$, the reliability of x in its unit length. 
Then 

$$r_{x''y} = r_{xy}\sqrt{\frac{n}{1 + (n - 1)r_{xx'}}} \tag{15.25c}$$

Theoretically, n can take any positive value, including fractional values, 
but because of nonstatistical considerations, the formula may not yield a 
good estimate if n is very small or very large. It assumes, of course, that 
the proportion of random error in each equal subdivision of a test is 
identical. 

Consider a test with a validity of .30 and a reliability of .70. What would 
be its validity if (1) it were doubled in length; and (2) if it were reduced to 
.4 of its present length? 

By Formula 15.25c, when n = 2, 

$$r_{x''y} = .30\sqrt{\frac{2}{1 + (2 - 1).70}} = (.30)(1.08) = .32$$

and when n = .4, 

$$r_{x''y} = .30\sqrt{\frac{.4}{1 + (.4 - 1).70}} = (.30)(.83) = .25$$

If the two reliabilities are known and x is lengthened n times and y is 
lengthened n' times, substitution of Formula 15.28 in Formula 15.25 
yields a formula for estimating the changed correlation between x and y: 

$$r_{x''y''} = r_{xy}\sqrt{\frac{n\, n'}{[1 + (n - 1)r_{xx'}][1 + (n' - 1)r_{yy'}]}} \tag{15.25d}$$

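
The two length formulas (lengthening x alone, and lengthening both x and y) can be sketched as a single Python function. The function and parameter names are hypothetical; leaving the criterion at its default length of 1 gives the one-variable case.

```python
import math


def validity_after_length_change(r_xy, r_xx, n, r_yy=None, n_prime=1.0):
    """Expected correlation when x is lengthened n times and, optionally,
    y is lengthened n_prime times.

    r_xx and r_yy are reliabilities at unit length. If r_yy is None the
    criterion is left unchanged.
    """
    factor = n / (1.0 + (n - 1.0) * r_xx)
    if r_yy is not None:
        factor *= n_prime / (1.0 + (n_prime - 1.0) * r_yy)
    return r_xy * math.sqrt(factor)


# Validity .30, reliability .70, as in the example above.
doubled = validity_after_length_change(0.30, 0.70, 2)      # about .325
reduced = validity_after_length_change(0.30, 0.70, 0.4)    # about .249

# Both variables doubled: r_xy = .25, reliabilities .80 and .70.
both = validity_after_length_change(0.25, 0.80, 2, r_yy=0.70, n_prime=2)
```

Carried to more places, the doubled-length value is about .325; the .32 in the worked example comes from rounding the multiplier to 1.08 before multiplying.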


SCORING FORMULAS 


The simplest, and one of the best, methods for scoring a test made up of 
separate items is to use merely the number of items correct. If omitted 
items, as well as items incorrectly marked, are counted wrong, then any 
weighted combination of rights and wrongs correlates perfectly with the 
number of rights. (The inconsequential and exceptional case is when rights 
and wrongs are added together, so that each score represents the number of 
questions, which would destroy the test as an instrument capable of 
making discriminations.) 
The result of any effective scoring formula is to put items into three 
categories: rights, wrongs, and omits, with each category contributing 
differentially to the total score. Since the number of items is a constant, 
only two of the three bits of information are needed (rights, denoted as R; 
wrongs, denoted as W; or omits, denoted as O) with which to enter the 
formula. Generally, scoring formulas are in terms of R and W. 


SCORING FORMULAS: CORRECTION FOR GUESSING 
If the n items of a true-false or multiple-choice psychological test were 
answered at random by someone entirely lacking the characteristic 
measured by the instrument, the expected number of items answered 
correctly would be n/n', in which n' is the number of choices. Thus, in a 
true-false test (which has two choices), half the items should be answered 
correctly by chance; with three-choice items, a third; with four-choice 
items, 25 percent; and so on. Obviously, the larger the number of choices 
in each item (provided each is equally likely to be chosen), the less sheer 
chance is likely to influence the result. 

It has been shown that the effect of guessing can be minimized by 
utilizing the following formula: 

$$X = R - \frac{W}{n' - 1} \tag{15.29}$$

a formula that is useful when there is a large number of omissions and 
when it is desired to treat omissions differently from the responses that 
are definitely wrong. For true-false tests, Formula 15.29 becomes 

$$X = R - W \tag{15.29a}$$

and for five-choice multiple-choice tests, 

$$X = R - \frac{W}{4} \tag{15.29b}$$

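
Formula 15.29 can be written as a short Python function. The name `corrected_score` and the score counts used below are hypothetical illustrations.

```python
def corrected_score(rights: float, wrongs: float, n_choices: int) -> float:
    """Formula 15.29: X = R - W / (n' - 1). Omitted items do not enter."""
    return rights - wrongs / (n_choices - 1)


true_false = corrected_score(30, 10, 2)      # Formula 15.29a: R - W
five_choice = corrected_score(30, 10, 5)     # Formula 15.29b: R - W/4

# A subject guessing blindly on 100 four-choice items expects 25 rights
# and 75 wrongs, for which the corrected score is zero.
blind_guesser = corrected_score(25, 75, 4)
```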


EMPIRICAL SCORING FORMULAS 

In some types of tests it has been found that the psychological function 
measured by R has a high negative correlation with the function measured 
by W. In such cases a scoring formula is of little value, since by giving a 
negative weight to W, the score merely includes more of the same kind of 
variance already measured by R. 

In other instances, W has little variance compared with the variance of 
R, and here again there is usually little to be gained by using a scoring 
formula. However, one occasionally finds a situation in which W has 
considerable variance, has a relatively low relationship with R, and has 
definite negative validity with a criterion. 
If there are several such tests in a battery, and the battery has an outside 
criterion, the solution is clear: The rights and wrongs in each test are 
treated as separate variables and all variables are validated against the 
criterion. Raw-score regression coefficients applied to R and W in each 
test, then, constitute the scoring formula. 

A similar procedure can be applied to a single test used to predict a 
single criterion. Let O be the criterion. The regression equation may be 
written 

$$X_O = \frac{s_O}{s_R}\beta_{OR.W} X_R + \frac{s_O}{s_W}\beta_{OW.R} X_W + K$$

Since the constant K has no effect on the correlations of the summed 
variable, it can be dropped. Since it is convenient to apply the scoring 
formula only to the wrongs, the regression coefficients can be divided by the 
regression coefficient for the rights, and the weights to be applied to the 
rights can be made unity. If a is the weight to be applied to the wrongs, 

$$a = \frac{s_O s_R \beta_{OW.R}}{s_W s_O \beta_{OR.W}} = \frac{s_R \beta_{OW.R}}{s_W \beta_{OR.W}} = \frac{s_R(r_{OW} - r_{OR} r_{RW})}{s_W(r_{OR} - r_{OW} r_{RW})} \tag{15.30}$$



There is no need to use such a formula unless: 

- A substantial proportion of the items have been omitted, as in a speeded 
test; and 

- The intercorrelations of the three variables, R, W, and the criterion, 
are such that the multiple correlation of R and W with the criterion is 
appreciably higher than the validity of the rights alone. 


As with any multiple R, shrinkage is to be anticipated in applying an 
empirical scoring formula in a new sample of cases. Any such formula 
should be cross-validated on a new sample in order to determine whether 
the gain in validity in the original sample holds up. 
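
The weight for the wrongs in Formula 15.30 depends only on the two standard deviations and the three intercorrelations. The Python sketch below uses a hypothetical function name and hypothetical input values; it is not drawn from data in the text.

```python
def wrongs_weight(s_r, s_w, r_or, r_ow, r_rw):
    """Formula 15.30: weight a applied to W when the weight of R is unity.

    s_r, s_w: standard deviations of rights and wrongs.
    r_or, r_ow: validities of rights and wrongs against the criterion O.
    r_rw: correlation between rights and wrongs.
    """
    return s_r * (r_ow - r_or * r_rw) / (s_w * (r_or - r_ow * r_rw))


# Hypothetical values: W has negative validity and a low relation to R.
a = wrongs_weight(s_r=6.0, s_w=4.0, r_or=0.50, r_ow=-0.30, r_rw=-0.20)


def empirical_score(rights, wrongs, weight):
    """Score X = R + aW, with the constant K dropped."""
    return rights + weight * wrongs
```

With these inputs the weight is negative, so each wrong answer lowers the score by a fraction of a point, the fraction being set by the data rather than by the number of choices.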


ITEM ANALYSIS 


DIFFICULTY OR POPULARITY ANALYSIS 


Item analysis refers to the computation of statistics describing individual 
items. Such statistics are frequently used in developing a useful test from a 
collection of experimental items; or in revising an existing test. The most 
common, the simplest, and possibly the most useful type of item analysis 
involves the measurement of the difficulty or popularity of particular 
responses. 

In an aptitude or achievement test, item difficulty is measured in terms 
of p, the proportion of subjects choosing the correct answer. If each item 
is scored 1 or 0, then p is the mean score on the item. Sometimes the entire 
group of subjects taking the test is used in the formula 

$$p = \frac{N_p}{N}$$

while in other cases, N refers to the number actually attempting the item. 
The choice depends on which base will make p the best reflection of item 
difficulty. 

An important use of p is in determining the order of the items in the 
test as a whole or in a subtest. It is considered good practice to begin a 
power test with one or two items that all subjects are extremely likely to 
pass and end with a few items on which not many subjects are likely to 
succeed. The easy items are thought to help engender confidence. In a power 
test, items passed or failed by all subjects add nothing to the ability of the 
test to make discriminations among subjects. In such a test, items should 
be arranged more or less in inverse order of difficulty, from p = 1.00 
toward p = .00. 

A second use of the p values of items is in the development of speeded 
tests. Speed in a function may be measured by a collection of items of 
uniform difficulty, of which p is an accepted measure. The number 
of items answered in a set amount of time can then be taken as a measure 
of speed. 

With items on structured interest and personality tests, p is a measure of 
popularity rather than difficulty. Again, however, items with p values 
of .00 or 1.00 do not make discriminations among the respondents, and 
items with very low or very high p values make relatively few discriminations. 

As noted in Chapter 9 with respect to a dichotomous variable, knowledge 
of p leads directly to knowledge of the variance and the standard deviation, 
since V = pq, in which q = 1 - p. The variance (and the standard deviation) 
are at a maximum when p = q = .50, but at first the variance drops rather 
slowly as p changes. When p is .40 or .60, the variance is .24, as compared 
with a maximum of .25, and only at approximately .15 or .85 has half of 
the variance disappeared. What each item adds to the total variance of a 
test is a function of the item variance and the covariances of the item with 
all other items. However, if the covariances are too high, the item will add 
to the numerical value of the total variance without adding appreciably 
to the usefulness of the instrument in making discriminations among 
people. 
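
The behavior of the item variance as p moves away from .50 can be tabulated with a few lines of Python. The function name is hypothetical; the p values are those mentioned in the paragraph above.

```python
def item_variance(p: float) -> float:
    """Variance of an item scored 1 or 0: V = pq, with q = 1 - p."""
    return p * (1.0 - p)


# Variances at the p values discussed in the text.
variances = {p: item_variance(p) for p in (0.50, 0.40, 0.60, 0.15, 0.85)}
# An item passed or failed by everyone has no variance at all.
no_discrimination = (item_variance(0.0), item_variance(1.0))
```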


POPULARITY OF DISTRACTORS 

In a structured psychological test, each question has n' alternate answers. 
The p value reflects the proportion who choose the correct alternative. 
As a preparation for revising the items, the proportions of subjects 
choosing the several wrong responses (the "distractors" or "decoys") are 
useful. If p is the proportion choosing the right answer among n' choices, 
then the ideal distribution of those choosing wrong answers is (1 - p)/ 
(n' - 1) of the cases for each wrong response. This would indicate that the 
distractors are equally attractive. 

If a decoy is not used at all, the item no longer has n' choices, but in 
effect has only (n' - 1) choices. For the distractors to be maximally 
effective, they should all have a probability of being chosen by one who 
does not possess the characteristic measured by the item. While few items 
have the ideal distribution of wrong responses, tests can be greatly 
improved by using distractor analysis to indicate which decoys need to be 
replaced or edited. 
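
A distractor analysis of a single item might be sketched in Python as follows. The function names and the response counts are hypothetical; the comparison of observed with ideal proportions is one simple way of flagging decoys, not a procedure prescribed by the text.

```python
def ideal_distractor_share(p_correct: float, n_choices: int) -> float:
    """Ideal proportion of all cases per wrong response: (1 - p) / (n' - 1)."""
    return (1.0 - p_correct) / (n_choices - 1)


def distractor_departures(wrong_counts, n_total, p_correct):
    """Observed minus ideal proportion for each wrong alternative."""
    ideal = ideal_distractor_share(p_correct, len(wrong_counts) + 1)
    return [count / n_total - ideal for count in wrong_counts]


# Hypothetical five-choice item: 100 subjects, 60 correct, and the four
# decoys chosen by 20, 12, 8, and 0 subjects.
departures = distractor_departures([20, 12, 8, 0], n_total=100, p_correct=0.60)
```

Here the last decoy falls a full .10 below its ideal share because no one chose it, so the item in effect has only four choices.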


ITEM ANALYSIS: INTERNAL CONSISTENCY 


To develop a test that measures a single function, it is necessary to select 
a group of items with relatively high (but not too high) intercorrelations. 
If the intercorrelations are too high, each will measure the identical small 
aspect of the trait over and over again; and the test will not make the 
proper differentiations in the characteristic measured. The simplest 
method of choosing items with relatively high intercorrelations is to develop 
first a pool that is judged to measure aspects of the same general trait. A 
total score is obtained and each item is correlated with this total, desig- 
nated as t. It is readily seen that the covariance between each item and the 
total of which it is a part is a function of the item variance and the co- 
variances between the item and the other items in the pool. 

Let $t = a + b + \cdots + i$, in which t is the total score and $a, b, \ldots, i$ are 
the item scores, all in deviation form. By multiplying the expression by i, 
summing, dividing by N, and writing the covariances and the single variance, 
we have 


$$C_{it} = C_{ia} + C_{ib} + \cdots + V_i \tag{15.31}$$


We can divide Formula 15.31 by $s_i s_t$, multiply each covariance such as 
$C_{ia}$ by a term such as $s_a/s_a$, and write the correlation 


$$r_{it} = \frac{1}{s_t}\left(r_{ia}s_a + r_{ib}s_b + \cdots + s_i\right) \tag{15.32}$$

Although Formula 15.32 reflects high item intercorrelations, items can 
be selected on the basis of other approximations, such as Formula 15.31. 
Another possibility would be to correct Formula 15.31 by removing the 
item variance. Thus we could compute for each item: 


$$C_{it} - V_i = C_{ia} + C_{ib} + \cdots \tag{15.31a}$$


Items with high values of this function would have high covariances 
with other items. 
It must be remembered, however, that t changes as items are added or 
subtracted. Consequently, if a function of the correlation or covariance 
between an item and the total score is used as a guide to the selection of 
internally consistent items, the process must be carried through several 
cycles, using a series of pools of items, each more consistent internally 
than its predecessor. 
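
Formulas 15.31 and 15.32 are identities, and a small numerical check may make them concrete. The sketch below uses hypothetical 0/1 scores for three items; the helper names are introduced here for illustration, and covariances are divided by N, following the derivation above.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Covariance with N (not N - 1) in the denominator."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def sd(xs):
    return math.sqrt(covariance(xs, xs))


def correlation(xs, ys):
    return covariance(xs, ys) / (sd(xs) * sd(ys))


# Hypothetical 0/1 scores of six people on three items a, b, and i.
items = {
    "a": [1, 1, 0, 1, 0, 1],
    "b": [1, 0, 0, 1, 0, 1],
    "i": [1, 1, 0, 0, 0, 1],
}
t = [sum(person) for person in zip(*items.values())]  # total scores

i_scores = items["i"]
others = [name for name in items if name != "i"]

# Formula 15.31: C_it = C_ia + C_ib + ... + V_i
c_it_direct = covariance(i_scores, t)
c_it_summed = (
    sum(covariance(i_scores, items[o]) for o in others)
    + covariance(i_scores, i_scores)
)

# Formula 15.32: r_it = (1 / s_t)(r_ia s_a + r_ib s_b + ... + s_i)
r_it_direct = correlation(i_scores, t)
r_it_formula = (
    sum(correlation(i_scores, items[o]) * sd(items[o]) for o in others)
    + sd(i_scores)
) / sd(t)

print(f"C_it: direct {c_it_direct:.4f}, from 15.31 {c_it_summed:.4f}")
print(f"r_it: direct {r_it_direct:.4f}, from 15.32 {r_it_formula:.4f}")
```

Because the total t is recomputed from whatever items are in the pool, dropping or adding an item changes every item-total statistic, which is why selection proceeds in cycles.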

Another technique is to select items in which some measure of internal 
consistency is maximized, such as Kuder-Richardson Formula 20; that is 


$$r_{tt} = \frac{n}{n-1}\left(\frac{V_t - \sum pq}{V_t}\right)$$


or the ratio of the sum of the covariances to the total test variance. This 
can be done by maximizing a part of Kuder-Richardson Formula 20, 
namely, $\sum C_{ij}/V_t$. This technique tends to pick a group of items measuring 
the same general function, and hence constituting a homogeneous test. 
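
As an illustration, Kuder-Richardson Formula 20 in its standard form, (n/(n - 1))(V_t - Σpq)/V_t, can be computed from a small matrix of 0/1 scores. The data and the function name `kr20` below are hypothetical; variances are divided by N.

```python
def kr20(scores):
    """KR Formula 20 for a list of rows (one per person) of 0/1 item scores."""
    n_people = len(scores)
    n_items = len(scores[0])

    totals = [sum(row) for row in scores]
    m_t = sum(totals) / n_people
    v_t = sum((t - m_t) ** 2 for t in totals) / n_people  # total variance

    sum_pq = 0.0  # sum of the item variances
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_people
        sum_pq += p * (1.0 - p)

    # (V_t - sum pq) is the sum of the inter-item covariances.
    covariance_ratio = (v_t - sum_pq) / v_t
    return (n_items / (n_items - 1)) * covariance_ratio


# Hypothetical scores of five people on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(scores):.2f}")
```

Here V_t = 2.0 and Σpq = .80, so the covariance ratio is .60 and the coefficient is (4/3)(.60) = .80.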


ITEM ANALYSIS: EXTERNAL VALIDITY 

With heterogeneous tests, the characteristic to be maximized is the 
correlation with an outside criterion. Selecting items in the order of their 
correlation with the criterion will tend to accomplish this objective. An 
item-criterion correlation may be expressed as $r_{ic}$, in which i represents the 
item and c the criterion. A solution somewhat better theoretically is to 
attempt to maximize the correlation between the total test and the criterion 
by choosing items with the highest item-criterion covariances. 
By a development parallel to that of Formula 15.32, 


$$r_{tc} = \frac{1}{s_t s_c}\left(C_{ac} + C_{bc} + \cdots + C_{ic}\right) \tag{15.33}$$


Inspection of this equation shows that selection of items with high 
covariances with the criterion c will result in a high numerator of the 
formula for $r_{tc}$. However, such selection does not guarantee the maximum 
$r_{tc}$, since the item intercorrelations are concealed within $s_t$, and their 
pattern has a marked effect on the validity of t. 

ITEM SELECTION BY APPROXIMATIONS TO MULTIPLE R 

Theoretically, $r_{tc}$ can be improved by some approximation to multiple 
correlation, which would select items on the basis of high correlations 
with the criterion and low intercorrelations within the constituted test. 
Multiple correlation itself would be time-consuming to use as an item 
selection technique, but its chief drawback is that its use would require 
fractional weights to be applied to the items in scoring the developed test. 
While it provides the best theoretical answer to the problem of developing 
an instrument to predict a single external criterion, it is seldom used. 
Various approximations to the multiple correlation technique have been 
attempted, usually with the restriction that the permissible weights be 
either 1 or 0. An item with a weight of 1 is selected; an item with a weight 
of 0 is discarded. 

One such procedure is to select items according to the value of the 
ratio $r_{ic}/r_{it}$; that is, directly according to their validities and inversely 
according to their correlations with the total score. A first group of items 
so selected can be used for a new cycle of selection procedures, using a 
new t. 

Actually, procedures involving the use of item intercorrelations for 
maximizing external validity have not been very successful. While it is 
relatively easy to maximize (or maximize approximately) the validity in 
a specific sample, the validity in subsequent samples tends to be little more 
than what might be attained simply by choosing the most valid items. 
Item correlations tend to be somewhat unstable from sample to sample, 
and a solution that is theoretically perfect may not be particularly useful 
in a practical situation. 

From time to time, proposals have been made to weight items differ- 
entially, with a rather wide range of weights. In theory the correlation 
between a group of items and a criterion is at a maximum when each 
item is weighted in accordance with its regression weight. Empirical 
attempts to do this have generally resulted in some increase in validity 
within the particular sample in which the key is developed, but little 
increase over simpler scoring procedures has been realized when the key is 
cross-validated in a new sample. 


TEST VARIANCE AND ITEM STATISTICS 


As are the variances and covariances of continuous variables, item 
variances and covariances are directly additive. The sum of the item co- 
variances between a given item and all other items in a test plus the variance 
of that item is precisely equal to the covariance between the item and the 
total score. Similarly, the sum of the covariances between items and the 
total score equals precisely the total variance. If all item variances and 
covariances are arranged in a matrix of n rows and n columns, with n being 
the total number of items, we have a convenient form for determining the 
characteristics of a test at any stage. By eliminating from the matrix 
those statistics referring to any item or group of items, statistics of the test 
with such items eliminated can be readily found, since all the statements 
regarding additive properties continue to hold. 
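
The additive properties described here can be verified on a small variance-covariance matrix. The sketch below uses hypothetical 0/1 scores on four items; the function names are illustrative only, and all variances and covariances are divided by N.

```python
def covariance(xs, ys):
    """Covariance with N in the denominator; covariance(x, x) is the variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def covariance_matrix(columns):
    """n x n matrix of item variances (diagonal) and covariances."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]


def block_sum(matrix, keep):
    """Sum of the cells whose row and column both belong to `keep`."""
    return sum(matrix[r][c] for r in keep for c in keep)


# Hypothetical 0/1 scores of six people on four items (one list per item).
columns = [
    [1, 1, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 1, 1],
]
C = covariance_matrix(columns)
totals = [sum(person) for person in zip(*columns)]

# A row sum equals the covariance of that item with the total score.
row_sum_item_0 = sum(C[0])
item_total_cov_0 = covariance(columns[0], totals)

# The sum of every cell equals the variance of the total score.
all_items = list(range(len(columns)))
total_variance = block_sum(C, all_items)

# Eliminating the third item (index 2) leaves the statistics of the shorter test.
kept = [0, 1, 3]
reduced_totals = [sum(columns[k][person] for k in kept)
                  for person in range(len(totals))]
reduced_variance = block_sum(C, kept)

print(f"item-total covariance: {row_sum_item_0:.4f} vs {item_total_cov_0:.4f}")
print(f"total variance: {total_variance:.4f} vs {covariance(totals, totals):.4f}")
print(f"variance without item 3: {reduced_variance:.4f} "
      f"vs {covariance(reduced_totals, reduced_totals):.4f}")
```

Each pair of printed values agrees, which is the sense in which the matrix gives the characteristics of the test at any stage of item elimination.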


NORMING PSYCHOLOGICAL TESTS 


As pointed out earlier, psychological data are gathered in units such as 
number of seconds required to complete a task, number of errors, or in 
psychological tests, raw scores, which may or may not involve a scoring 
formula. 

In much laboratory work, raw data are graphed and otherwise analyzed 
without transformation. In most theoretical statistical work, however, 
variables are treated as though they were actually z scores, which are 
described in Chapter 5. 

Few raw scores on psychological tests have inherent meaning. They 
must be interpreted by means of a transformation, which serves two 
purposes: 


1. It permits direct interpretation of the score; and 
2. Scores from different tests or different parts of the same test are made 
   comparable. 


Two general classes of norms resulting from such transformations are 
reference norms and statistical norms. Reference norms are those in 
which the raw scores are translated into terms directly significant. These 
include work norms, age norms, and grade norms. Statistical norms, 
which include percentiles, standard scores, and normalized scores, are 
mathematical transformations that are especially useful in test-to-test 
comparisons, but which in and of themselves have no direct meaning in 
terms of real life situations. 


REFERENCE NORMS 


Work norms have been used very little, probably because meaningful 
work standards closely related to psychological tests are seldom found. 
However, performance tests in stenography and typing are often reported 
in number of words per minute. Sometimes the scoring system involves 
penalties for errors, so that the final score involves a statistical adjustment. 
Another example is the grading of the oral trade tests, designed as instruments 
to estimate the knowledge of individuals about a particular trade or 
occupation. Raw scores on these tests are interpreted in three categories: 
novice, apprentice, and journeyman. 

In age norms, often used with tests for children, the average performance 
for each age is determined and the raw scores are converted to age equiva- 
lents. In computing the mental age on the Stanford-Binet, the child is given 
a certain number of months credit based on his “basal mental age,” the 
highest level at which he passes all the tests. He is given additional credit 
toward his mental age for each additional item passed, each item having a 
value in terms of months. When a test is carefully standardized, this 
procedure yields a score that is directly interpretable. Some group intelligence 
tests use a slightly different system, in which a raw score is first 
obtained and is then converted, by means of a table of norms, to a mental 
age equivalent. This same method is sometimes used with school achievement 
tests of reading and arithmetic. 

For tests depending primarily upon learning in the elementary school 
situation, grade norms may be used in preference to age norms. Here the 
raw score is converted into a grade equivalent, defined as the average 
performance of a particular grade level. Fractional grades are used for 
norming purposes, even though in the real school situation all grades are 
generally at least half a year apart. 

The use of reference norms implies that enough individuals have been 
measured so that stable and representative standards of performance are 
available. Unless the tests are reliable and the norms are stable, the 
possibilities of misclassification are high. The estimate of reliability is 
best made in terms of the standard error of measurement, that is, the 
variability to be anticipated between obtained and true scores. 


STATISTICAL NORMS 


In previous chapters, three transformations usable for norming psycho- 
logical tests have been discussed. The following computational examples 
have been given: percentiles, Example 4.2; percentile ranks, Example 4.3; 
standard scores, Example 5.2; and normalized scores, Example 11.4. 

The three types of statistical norms differ chiefly in their distributions. 
If a large number of raw scores, representing fine gradations of ability, 
are converted to percentiles, the distribution is theoretically rectangular. 
Between each percentile point, if the number of cases were very large and 
the gradations very fine, 1 percent of the distribution would be expected. 

If obtained scores are converted into standard scores and then dis- 
tributed, the result is a distribution similar in shape to the distribution of 
the original raw scores. The transformation of raw scores into standard 
scores is strictly linear, and the correlation of standard scores with the 
original raw scores is exactly 1.00, except for the effect of rounding error. 

Normalized scores resemble standard scores and are theoretically 
identical with them when the original distribution of raw scores is normal. 
However, there is a correction for any departure from normality in the 
original raw scores. Accordingly, the distribution of the normalized 
scores is more or less normal, irrespective of the shape of the original 
distribution. 


PERCENTILES AND PERCENTILE RANKS 


As described in Chapter 4, the two types of percentile transformations are 
percentiles proper and percentile ranks. With distributions of extremely 
large N and fine score gradations, percentiles and percentile ranks theoreti- 
cally coincide. In computing a percentile, we find the theoretical score 
below which lies a certain percentage of the distribution. This theoretical 
score is often fractional, even though the test is scored in discrete units. 
In computing percentile ranks, we start with integral scores and determine 
the theoretical percentage of the distribution that lies below that score. 
The formula for any particular percentile rank requires that the number of 
cases below the score, plus half the number of cases at the score, be 
divided by the total number of cases in the distribution. 
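
The percentile rank rule just stated can be sketched in a few lines. The scores below are hypothetical, and the function name `percentile_rank` is introduced for illustration; the result is multiplied by 100 to express the proportion as a percentage.

```python
def percentile_rank(score, scores):
    """Cases below the score, plus half the cases at the score,
    divided by the total number of cases, as a percentage."""
    below = sum(1 for s in scores if s < score)
    at = sum(1 for s in scores if s == score)
    return 100.0 * (below + 0.5 * at) / len(scores)


# Hypothetical integral raw scores for ten individuals.
scores = [3, 5, 5, 6, 7, 7, 7, 8, 9, 10]

for raw in sorted(set(scores)):
    print(f"raw score {raw:2d}: percentile rank {percentile_rank(raw, scores):.1f}")
```

For a raw score of 7, four cases lie below and three lie at the score, giving (4 + 1.5)/10, or a percentile rank of 55.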

Percentile ranks are readily computed from a frequency distribution, 
as in Example 4.3. 

Occasionally, test norms are reported in terms of score equivalents for 
selected percentiles. Often the percentiles shown are $P_1$, $P_5$, $P_{10}$, and so on, 
at intervals of five percentile points up to $P_{95}$ and $P_{99}$. An example of this 
type of norming is shown in Example 4.2 together with the computational 
steps. 


STANDARD SCORES 


The simplest type of standard score is the familiar z score, which is merely 
the number of standard deviations a score is above or below the mean 
of all the scores in its series. As presented in Formula 5.9, the z score is 


$$z_x = \frac{X - M_x}{s_x} = \frac{x}{s_x}$$


and the general formula with any assigned mean (M') and any assigned 
standard deviation (s') is (Formula 5.10) 


$$\mathrm{S.S.} = s'\left(\frac{X - M_x}{s_x}\right) + M'$$


For norming tests, the most popular standard score system uses an 
arbitrary mean of 50 and arbitrary standard deviation of 10. When used 
with a carefully defined base population, such as 12-year olds, this becomes 
the T score; when used for achievement tests with a population who have 
completed a year's study of a subject, it becomes a scaled score. 

Hull (1) advocated a standard score with M' of 50 and s' of 14 so as to 
attain as wide a spread as possible (±3.5s) and still have two-digit scores. 
Wechsler (3) uses an M' of 10 and an s’ of 3 for intelligence test subscores. 
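
The standard score systems mentioned above differ only in the assigned mean and standard deviation. A minimal sketch follows; the raw score of 121 and the obtained mean and standard deviation of 100 and 14 are hypothetical values chosen for illustration, and the function name `standard_score` is introduced here.

```python
def standard_score(x, mean, sd, new_mean=0.0, new_sd=1.0):
    """Linear conversion s'(X - M)/s + M'; the defaults give the z score."""
    z = (x - mean) / sd
    return new_sd * z + new_mean


# Hypothetical raw score, with obtained mean 100 and standard deviation 14.
raw, m, s = 121, 100, 14

z = standard_score(raw, m, s)
t_score = standard_score(raw, m, s, new_mean=50, new_sd=10)
hull = standard_score(raw, m, s, new_mean=50, new_sd=14)
wechsler = standard_score(raw, m, s, new_mean=10, new_sd=3)

print(f"z = {z:.2f}, M'=50 s'=10: {t_score:.1f}, "
      f"M'=50 s'=14: {hull:.1f}, M'=10 s'=3: {wechsler:.1f}")
```

With s' = 14, a z score of ±3.5 maps to 1 and 99, which is the two-digit spread Hull sought.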

An unlimited variety of standard score systems is possible. Irrespective 
of the arbitrary mean and standard deviation: 

1. Scores in different variables are reduced to a common metric; 
2. Scores are easily interpreted by reference to the assigned mean and 
   standard deviation; and 
3. All differences in standard scores are directly proportional to differences 
   in the original raw-score units. 

Since with most psychological tests, the obtained mean and standard 
deviation are functions of the particular testing material that happens to 
be employed, the assigned mean and standard deviation are generally 
just as appropriate as the original values. Standard score norming is 
shown in Example 5.2. 

The percentile system also reduces scores to a common metric, but 
differences in percentiles are not proportional to differences in original 
scores. Individuals who use tests in personnel selection, in counseling, and 
in clinical work find the percentile easy to interpret to others. The rec- 
tangular distribution of the percentile is probably less desirable than the 
distribution of standard scores, which is essentially unmodified from the 
distribution of the original observations. However, both systems seem 
likely to continue side by side. 


NORMALIZED SCORES 


Normalized scores are much like standard scores in that the mean and 
standard deviation are predetermined. However, conversion is effected in 
such a way that the normalized scores for the group used in establishing 
the norms yield a distribution that is approximately normal. The method 
has been treated in Chapter 11, where Example 11.4 demonstrates required 
computations. 


SUMMARY 


One of the principal areas for the application of psychological statistics 
is with measures of intelligence, aptitudes, achievement, interests, and 
personality traits. In a homogeneous test, items measure overlapping 
aspects of the same general characteristic; in a heterogeneous test, the 
items have relatively low intercorrelations, but normally have positive 
relationships with an outside criterion. Statistical concepts pertinent to 
total variables, including the mean, variance, and correlation, are applicable 
to items. 

Much of the theory of psychological tests centers around the concept 
of random error, which contributes to the total variance, but which by 
definition is unrelated to true variance and to the random error in other 
tests. The reliability of a test is the proportion of true variance and may be 
estimated by test-retest; by the correlation of alternate forms; by corre- 
lating one half of a test against the other half and then deducing the 
reliability of the whole; and by the method of rational equivalence, which 
correlates an existing test with its theoretical equivalent. The correlation 
of a test with an outside variable is systematically affected by changes in 
reliability, changes which may be caused by variation in length. The 
standard error of measurement, the estimated standard deviation of errors 
made in predicting true scores when we have knowledge only of the 
obtained scores, affords a practical interpretation of reliability. 

In building and revising a test, item analysis is an important tool. 
Difficulty or popularity analysis indicates which items may be dropped or 
modified because they contribute inadequately to the total variance. 
Various forms of internal consistency analysis are used in building a homo- 
geneous test, while item analysis against an external criterion may yield an 
instrument with high predictive validity. 

Another application of statistics is in establishing norms in terms of 
percentiles, standard scores, or normalized scores, permitting the comparison 
of the standing of the same individual on different tests or the 
comparison of an individual with a group of his peers. 


EXERCISES 


1. Using the item score matrix of Example 15.1, correlate the score on the 
odd-numbered items with the score on the even-numbered items and apply the 
Spearman-Brown prophecy formula to estimate the reliability of the test as 
a whole. Compare with $r_{xx}$ as estimated by KR Formula 20. 

2. Below are the scores, 1 or 0, of 25 individuals on the 20 items of a mechanical 
comprehension test. Find $r_{xx}$ by the "split-half" technique and by KR Formula 20. 

[Table of item scores (25 individuals by 20 items) not recoverable from the source.] 

3. Below is the variance-covariance matrix of ten items on a test of aircraft 
information (N = 1000). Find $r_{xx}$ by KR Formula 20. (A method for finding 
$V_x$ is given in Chapter 9.) 


ITEM     1     2     3     4     5     6     7     8     9    10 

   1  .217  .057  .023  .029  .047  .004  .014  .005  .058  .035 
   2        .246  .039  .048  .069  .004  .014  .015  .074  .054 
   3              .155  .040  .042  .007  .021  .018  .045  .029 
   4                    .164  .052  .007  .017  .015  .061  .034 
   5                          .236  .005  .016  .021  .088  .041 
   6                                .040  .007  .004  .012  .006 
   7                                      .069  .009  .024  .009 
   8                                            .069  .018  .007 
   9                                                  .232  .055 
  10                                                        .165 


4. A test, X, with a reliability of .85, is doubled in length to become X', while Y, 
with a reliability of .75, is tripled in length to become Y'. If $r_{xy}$ = .40, what 
is the best estimate of $r_{x'y'}$ on the assumption that the new material in each 
test is homogeneous with the old? 

5. The reliability of X, with S.D. of 5, is estimated to be .56. If the test is doubled 
in length (by adding new material with S.D. of 5, and with a correlation of 
.56 with the original material) what would be the new $s_{meas.}$? 

6. If the reliability of a test is .70 in a group in which the S.D. is 9, what would 
be the expected reliability of the test in a group in which the S.D. is 10? 

7. Prove that the correlation between a total score, X, and its error component, 
$e_x$, is $\sqrt{1 - r_{xx}}$. 

8. Demonstrate that if the p values of all items are identical, KR Formula 20 
can be written as 


$$r_{xx} = \frac{n(V_x - M_x) + M_x^2}{(n - 1)V_x}$$


MATRICES AND DETERMINANTS 
IN PSYCHOLOGICAL STATISTICS 


16 


NATURE OF MATRICES 
In describing correlations involving many variables, the notation of 
matrices and determinants is concise and informative. Extensive groupings 
of data may be indicated by single letters, and processes requiring hundreds 
or thousands of multiplications and additions may be denoted by two or 
three letters and a few symbols. Matrix formulas and equations can be 
used to describe conventional statistical operations, and in some cases 
they point the way to advanced methods of analysis. 

Actually, all the calculations of descriptive statistics can be described in 
words as series of numerical operations. More economically, however, 
these computations may be summarized in the notation of conventional 
scalar algebra, as in earlier chapters. When three or more variables are 
concerned, these same operations can generally be represented still more 
succinctly in matrix algebra, which is a further development of the scalar 
notation. 


MATRIX CONCEPTS 
Among the concepts often encountered in discussions of psychological 
data are types of matrices, the transpose of a matrix (an alternate way of 
writing a matrix in which columns take the place of rows, and vice versa), 
matrix addition and multiplication, and the inverse of a matrix, which 
corresponds in a general way to the reciprocal of conventional notation. 

Certain matrices can be evaluated as determinants, which have further 
interesting properties. The numerical procedures used to find the partial 
variances and covariances pertinent to multivariate correlation may be 
conveniently expressed in determinantal terms. 


ROWS AND COLUMNS OF A MATRIX 


While any rectangular arrangement of numbers may be called a matrix, 
in psychological statistics a matrix is generally a systematic arrangement of 
numerical information in which rows and columns have assigned meanings. 
For example, Table 16.1 is a roster or matrix of test scores in which the 
rows represent individuals and the columns represent tests or variables. 


TABLE 16.1. A HYPOTHETICAL ROSTER OF SCORES 


                  (1)             (2)            (3)          (4) 
                GENERAL        MECHANICAL     ARITHMETIC    SPATIAL 
              INTELLIGENCE   COMPREHENSION    REASONING    RELATIONS 

Arthur            121             46             26           50 
Benjamin          114             49             20           58 
Charles           114             40             24           58 
David             107             34             22           62 
Eugene            100             43             24           42 
Frederick          93             46             16           54 
George             86             37             18           42 
Henry              86             34             14           38 
Irving             79             31             16           46 


Full identification of the rows and columns can be omitted if the 
assigned meanings are well understood. In all work with matrices it is 
conventional to refer first to the row or rows and secondly to the column 
or columns. In Table 16.1, each separate number can be considered an 
element of the matrix, denoted by a letter such as a, and with subscripts to 
indicate the row and column in which the element belongs. If the rows 
are denoted as a, b, ..., i and the columns as 1, 2, ..., 4, then the score of David 
on the arithmetic reasoning test is $a_{d3}$, or 22. In general terms, $a_{ij}$ refers 
to the element in the ith row and jth column of matrix A. 

Matrices are often considered to be divided by horizontal and vertical 
lines into sections of one element each. These sections may be thought of as 
cells and the elements within them as cell entries. 


ORDER OF A MATRIX 


The order of a matrix is merely the number of its rows and the number of 
its columns, connected by a multiplication sign. The order of the matrix 
in Table 16.1 is 9 × 4, 9 being the number of rows and 4 being the number 
of columns. In general terms, it is conventional to use m for the number 
of rows and n for the number of columns. If the matrix is square, with the 
same number of rows and columns, m = n, and the order is designated 
merely as n. 

Elements within a matrix may also have “order” in a sense different 
from “order” as applied to the matrix as a whole. In a matrix of partial 
variances and covariances, for example, the “order” of the coefficients 
refers to the number of variables that have been partialed out in forming 
them. In a single matrix, then, the term order may be used with two 
entirely different meanings; in one sense, order refers to the number of 
rows and columns; and in the other sense, order refers to a characteristic 
of the elements. 


VECTORS 

The generic name for a row or column (without specifying which) is 
vector. More specifically, a vector is conceived to be a 1 × n or an n × 1 
matrix, existing independently or as a part (or submatrix) of a larger 
matrix. 

Vectors are often designated by lower-case letters in heavy type, and 
the elements of a row vector are generally enclosed in parentheses. Thus 
the row vector of Charles' scores (four elements) from Table 16.1 would 
be written 


c = (114, 40, 24, 58) 


Column vectors may be written vertically with the marks (yet to be 
described) used to distinguish matrices or, for convenience, they may be 
written horizontally and enclosed in brackets. Thus the column vector 
for the scores on the mechanical comprehension test in Table 16.1 consists 
of nine elements, one for each of the nine individuals. If it is designated 
$x_2$, the subscript indicating that it is the column assigned to the second 
variable, it may be written 


$x_2$ = {46, 49, 40, 34, 43, 46, 37, 34, 31} 


SCALARS 

When the logic of matrices is carried down to the limiting case, we have a 
matrix with a single row and a single column. Such a 1 × 1 matrix is 
called a scalar and it may be considered as a chief connecting link between 
matrix algebra and conventional, or scalar, algebra. A scalar in matrix 
algebra has identical properties in all operations, as does any quantity (in 
literal or numeric form) in conventional algebra. 


DESIGNATION OF COMPLETE MATRICES 


A complete matrix is indicated by a capital letter in heavy type, such as 
A, S, or X, or by the typical element in conventional lower case enclosed 
in parentheses, such as $(a_{ij})$ or $(x_{ij})$. When the numbers or letters of a 
matrix are written out in full, they are enclosed in double lines, or brackets, 
or large parentheses, or single lines curved at the ends. All are recognized 
as good usage. Here we elect to use double lines, which contrast neatly 
with the single lines used to designate the determinants to be described 
later. 

The scores of Table 16.1 can be converted to deviations by subtracting 
from each value the mean of the variable. (Later this familiar procedure 
will be stated as a series of matrix operations.) 

Thus, from each value in vector 1, the general intelligence test, we 
subtract its mean, 100, and from each value in vector 2, we subtract the 
mean, 40, and so on. The resultant matrix is designated as X and is shown 
in Table 16.2. 


TABLE 16.2. SCORES OF TABLE 16.1 AS DEVIATIONS FROM THEIR RESPECTIVE MEANS 


        ||  21    6    6    0 || 
        ||  14    9    0    8 || 
        ||  14    0    4    8 || 
        ||   7   -6    2   12 || 
  X  =  ||   0    3    4   -8 || 
        ||  -7    6   -4    4 || 
        || -14   -3   -2   -8 || 
        || -14   -6   -6  -12 || 
        || -21   -9   -4   -4 || 

All the information of Table 16.1 is represented in X, provided we re- 
member the meanings of the rows and columns and the means of the four 
variables. 

The standard deviations of the four variables in X are respectively 14, 6, 
4, and 8. If each deviation is divided by the standard deviation of the 
variable, the result is Z, the matrix of the four variables in z form, shown 
in Table 16.3. 
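
The conversion of the roster of Table 16.1 to deviation form and then to z form can be sketched as follows. Plain Python lists stand in for matrices here; the variable names are illustrative, and the standard deviations are computed with N in the denominator, which reproduces the values 14, 6, 4, and 8 stated above.

```python
import math

# Rows are individuals (Arthur ... Irving); columns are the four tests of Table 16.1.
scores = [
    [121, 46, 26, 50],
    [114, 49, 20, 58],
    [114, 40, 24, 58],
    [107, 34, 22, 62],
    [100, 43, 24, 42],
    [93, 46, 16, 54],
    [86, 37, 18, 42],
    [86, 34, 14, 38],
    [79, 31, 16, 46],
]
n = len(scores)

# Column means; zip(*scores) yields the columns of the roster.
means = [sum(column) / n for column in zip(*scores)]

# X: each score minus the mean of its column.
X = [[value - m for value, m in zip(row, means)] for row in scores]

# Standard deviations of the columns of X, dividing by N.
sds = [math.sqrt(sum(x * x for x in column) / n) for column in zip(*X)]

# Z: each deviation divided by the standard deviation of its column.
Z = [[x / s for x, s in zip(row, sds)] for row in X]

print("means:", means)
print("standard deviations:", sds)
for row in Z:
    print("  ".join(f"{z:5.2f}" for z in row))
```

The means come out as 100, 40, 20, and 50, so the first row of X is 21, 6, 6, 0 and the first row of Z is 1.50, 1.00, 1.50, .00.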


TABLE 16.3. INFORMATION OF TABLE 16.1 AND TABLE 16.2 IN z FORM 


        ||  1.50   1.00   1.50    .00 || 
        ||  1.00   1.50    .00   1.00 || 
        ||  1.00    .00   1.00   1.00 || 
        ||   .50  -1.00    .50   1.50 || 
  Z  =  ||   .00    .50   1.00  -1.00 || 
        ||  -.50   1.00  -1.00    .50 || 
        || -1.00   -.50   -.50  -1.00 || 
        || -1.00  -1.00  -1.50  -1.50 || 
        || -1.50  -1.50  -1.00   -.50 || 


SOME TYPES OF MATRICES 


Terms useful in describing special types of matrices of interest in statistical 
operations include the zero, the symmetric, the triangular, the diagonal, 
and the identity matrices. In general, these matrix formats do not fit 
original data or even the results of statistical analyses; rather they are 
useful forms in connection with matrix computations and manipulations 
later to be described. All the terms describe regularities of internal matrix 
structure, as contrasted with the matrix in Table 16.1 where the internal 
structure is completely free, since no element is necessarily identical with 
any other element. 

If a matrix is composed of elements all of which have the value of zero, 
it is a zero (or null) matrix. A zero matrix may be rectangular or square; 
whereas the symmetric, triangular, diagonal, and identity matrices are 
necessarily square. 

In a symmetric matrix every element below the main diagonal (the cells 
that constitute a chain from the upper left to the lower right) has an 
element of exactly the same value in a predetermined position above the 
main diagonal. That is to say that $a_{ij}$ (the element in the ith row and jth 
column) is identical with $a_{ji}$ (the element in the jth row and ith column). 
If i does not equal j, $a_{ij}$ is a general expression for off-diagonal elements, 
all of which appear in the matrix as pairs. Examples of symmetric matrices 
are complete matrices of variances and covariances and complete matrices 
of correlation coefficients. 

If all elements below or above the main diagonal are zeros, it is a tri- 
angular matrix. If all elements both below and above the diagonal are 
zeros, but the elements in the diagonal have value, it is a diagonal matrix. 
A matrix with the same number in all the cells of the main diagonal but 
with 0 in all other cells is a scalar matrix. A scalar matrix with 1's in the 
diagonal is an identity or unity matrix. Numerical examples of these 
matrices are given in Table 16.4. 
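
The definitions of the special matrix types can be restated as simple tests on a matrix stored as a list of rows. The predicate names below are introduced for illustration only; the numerical examples are those of Table 16.4.

```python
def is_square(m):
    return all(len(row) == len(m) for row in m)


def is_zero(m):
    return all(value == 0 for row in m for value in row)


def is_symmetric(m):
    n = len(m)
    return is_square(m) and all(
        m[i][j] == m[j][i] for i in range(n) for j in range(i))


def is_triangular(m):
    """All elements below, or all elements above, the main diagonal are zero."""
    if not is_square(m):
        return False
    n = len(m)
    below_zero = all(m[i][j] == 0 for i in range(n) for j in range(i))
    above_zero = all(m[j][i] == 0 for i in range(n) for j in range(i))
    return below_zero or above_zero


def is_diagonal(m):
    n = len(m)
    return is_square(m) and all(
        m[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def is_scalar_matrix(m):
    return is_diagonal(m) and all(m[i][i] == m[0][0] for i in range(len(m)))


def is_identity(m):
    return is_scalar_matrix(m) and m[0][0] == 1


zero = [[0, 0], [0, 0], [0, 0]]
symmetric = [[1.00, .48, .32, .16],
             [.48, 1.00, .24, .12],
             [.32, .24, 1.00, .08],
             [.16, .12, .08, 1.00]]
triangular = [[1.00, .50, .40],
              [.00, .75, .30],
              [.00, .00, .72]]
diagonal = [[.0714, 0, 0, 0], [0, .1667, 0, 0], [0, 0, .2500, 0], [0, 0, 0, .1250]]
scalar = [[5, 0, 0, 0], [0, 5, 0, 0], [0, 0, 5, 0], [0, 0, 0, 5]]
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print("zero matrix is square:", is_square(zero))
print("symmetric:", is_symmetric(symmetric))
print("triangular:", is_triangular(triangular))
print("diagonal:", is_diagonal(diagonal), " scalar:", is_scalar_matrix(scalar),
      " identity:", is_identity(identity))
```

Under these tests the types nest: an identity matrix is also a scalar matrix, a scalar matrix is also diagonal, and a diagonal matrix is both symmetric and triangular.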


TABLE 16.4. EXAMPLES OF SPECIAL TYPES OF MATRICES USED IN 
MATRIX OPERATIONS 


Zero Matrix 

    0    0 
    0    0 
    0    0 

Symmetric Matrix 

 1.00   .48   .32   .16 
  .48  1.00   .24   .12 
  .32   .24  1.00   .08 
  .16   .12   .08  1.00 

Triangular Matrix 

 1.00   .50   .40 
  .00   .75   .30 
  .00   .00   .72 

Diagonal Matrix 

 .0714  .0000  .0000  .0000 
 .0000  .1667  .0000  .0000 
 .0000  .0000  .2500  .0000 
 .0000  .0000  .0000  .1250 

Scalar Matrix 

    5    0    0    0 
    0    5    0    0 
    0    0    5    0 
    0    0    0    5 

Identity Matrix 

    1    0    0    0 
    0    1    0    0 
    0    0    1    0 
    0    0    0    1 


MATRIX TRANSPOSITION 


Any matrix can be rewritten so that the elements of the rows of the 
original matrix become the elements of the columns of a new matrix, 
called the transpose. A transpose is indicated by a prime, or accent mark, 
or by a superscript T. Thus A' or $A^T$ is the transpose of A. If S is the score 
matrix of Table 16.1, then the transpose is 


        || 121  114  114  107  100   93   86   86   79 || 
 S'  =  ||  46   49   40   34   43   46   37   34   31 || 
        ||  26   20   24   22   24   16   18   14   16 || 
        ||  50   58   58   62   42   54   42   38   46 || 



By transposition the rows of A become the columns of A’, and vice 
versa. Another way of describing transposition is to say that element $a_{ij}$ 
of the original becomes element $a_{ji}$ of the transpose. 

The transpose of the transpose is, of course, the original matrix, that is, 
(A')' is A. If a matrix is rectangular, we often think of the form with the 
smaller number of columns as the original and the form with the larger 
number of columns as the transpose; but actually, the designation as to 
which is which is more or less arbitrary. Any nonsymmetric matrix can be 
written in two forms, one of which is the transpose of the other; and in 
many matrix operations, transposition is vitally important. When rows 
and columns are interchanged, properties of the matrix may be changed. 
However, if a matrix is symmetrical, it is identical with its transpose. 
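
As a rough illustration of transposition, the following Python sketch rebuilds S from the array printed above as its transpose. The helper name `transpose` and the small 2 x 2 symmetric matrix are assumptions made for the example; they do not come from the text.

```python
def transpose(matrix):
    """Return the transpose: rows of the input become columns of the output."""
    return [list(column) for column in zip(*matrix)]

# The 4 x 9 array printed in the text as the transpose of the score matrix S.
S_T = [
    [121, 114, 114, 107, 100, 93, 86, 86, 79],
    [46, 49, 40, 34, 43, 46, 37, 34, 31],
    [26, 20, 24, 22, 24, 16, 18, 14, 16],
    [50, 58, 58, 62, 42, 54, 42, 38, 46],
]

S = transpose(S_T)               # 9 rows by 4 columns
S_T_again = transpose(S)         # transposing twice restores the starting array

# A small symmetric matrix (illustrative values) equals its own transpose.
symmetric = [[1.00, 0.48], [0.48, 1.00]]
is_unchanged = transpose(symmetric) == symmetric
```

Element `S[i][j]` here is element `S_T[j][i]`, which is the subscript reversal described above.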


MATRIX EQUATIONS 


Table 16.1 may appear to be simply a display of information, but matrices
in general are not inert. They enter into equations; they may be modified
by other matrices, including scalars and vectors; and they may act upon
or modify still other matrices. When a matrix equation is solved, values for
a number of unknowns may be obtained more or less simultaneously, since
in some cases a matrix equation is actually a set of n simultaneous linear
equations in n unknowns, n x n and n x 1 being the orders of the matrices
involved.

A matrix equation as a whole is handled analogously to an equation in
ordinary algebra. An expression on one side may be simplified or otherwise
modified by operations internal to that side, without affecting the
other side of the equation. However, if the expression on one side of
the equation is changed by adding, subtracting, or multiplying by an
outside factor, then both sides must be treated in the same fashion.
An advantage of matrix equations is that relationships which may be
obscure when all ideas are expressed in conventional algebra may become
conspicuous in matrix notation. To know how to handle matrix
equations, it is necessary to understand fundamental operations, some of
which are exactly parallel to those of scalar algebra. Other operations,
however, involve new principles.

SOME OPERATIONS IN MATRIX ALGEBRA


EQUALITY OF MATRICES
Two matrices are equal if and only if they are of the same order and if
each element in one matrix equals precisely the corresponding element in
the other. If $a_{ij}$ is the typical element in A and $b_{ij}$ is the typical
element in B, then $a_{ij}$ must equal $b_{ij}$ for all values of i and j. This
principle permits a matrix equation involving simple simultaneous linear
equations to be reduced to a vector of letters on one side of the equation
and a vector of numerical values on the other.


ADDITION AND SUBTRACTION 

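Matrices can be added together if and only if they are of exactly the same
order. Corresponding elements are merely summed. The order in which
the matrices to be summed appear in an equation is immaterial; and if
several matrices are summed, the order in which they are added is of no
consequence. Exactly the same principles apply to matrix subtraction. An
algebraic example, involving both subtraction and addition, is given
below:

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \\ a_{41} & a_{42} \end{bmatrix} - \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \\ b_{41} & b_{42} \end{bmatrix} + \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \\ c_{31} & c_{32} \\ c_{41} & c_{42} \end{bmatrix} = \begin{bmatrix} a_{11} - b_{11} + c_{11} & a_{12} - b_{12} + c_{12} \\ a_{21} - b_{21} + c_{21} & a_{22} - b_{22} + c_{22} \\ a_{31} - b_{31} + c_{31} & a_{32} - b_{32} + c_{32} \\ a_{41} - b_{41} + c_{41} & a_{42} - b_{42} + c_{42} \end{bmatrix}$$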


MATRIX MULTIPLICATION 

Two matrices may be multiplied together, provided they can be arranged
in an order such that the number of elements in the rows of the first
matrix equals the number of elements in the columns of the second. This
statement implies two important restrictions on matrix multiplication:

1. Only certain matrices may be multiplied together; and
2. The sequence in which the two matrices appear may affect the results of
multiplication. In general, in terms of matrices, AB does not equal BA.

The computation of a variance affords a simple example of matrix
multiplication. In scalar notation,

$$V = \frac{\sum x^2}{N} \tag{5.3}$$


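that is, the variance is the mean of the squares of the deviations from the
arithmetic mean. These deviations may be written as a vector X, that is,

$$X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_N \end{bmatrix}$$

of which the transpose (obtained by reversing rows and the single column)
is

$$X^T = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_N \end{bmatrix}$$

We now arrange the two matrices in the order $X^TX$ and multiply:

$$X^TX = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots & x_N \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_N \end{bmatrix} = \sum x^2 \tag{16.1}$$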


It is to be noted that the number of elements in the row vector $X^T$ is N,
exactly the same as the number of elements in the column vector X. Each
term in $X^T$ is multiplied by the corresponding term in X, and the products
are summed as a scalar, $\sum x^2$. It should be pointed out that $X^T$ is
post-multiplied by X, that is, the matrices appear in the sequence $X^TX$. In
this instance, as in instances to be pointed out later, a different sequence
of the matrices (that is, $XX^T$, or $X^T$ premultiplied by X) would produce a
radically different result.
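
The contrast between the two sequences can be sketched in Python. The helper `matmul` and the four deviation values are illustrative assumptions (the deviations were chosen only so that they sum to zero); nothing here is taken from the tables of the text.

```python
def matmul(first, second):
    """Row-by-column product of two matrices stored as lists of rows."""
    if any(len(row) != len(second) for row in first):
        raise ValueError("row length of first must equal column length of second")
    return [
        [sum(row[k] * second[k][j] for k in range(len(second)))
         for j in range(len(second[0]))]
        for row in first
    ]

# Hypothetical deviations from a mean, N = 4.
X = [[2], [-1], [0], [-1]]       # column vector, order 4 x 1
X_T = [[2, -1, 0, -1]]           # its transpose, order 1 x 4

sum_of_squares = matmul(X_T, X)  # order 1 x 1: the scalar sum of squared deviations
outer = matmul(X, X_T)           # order 4 x 4: every pairwise product of deviations
variance = sum_of_squares[0][0] / len(X)
```

With these values the first product is the single number 6, while the second is a 4 x 4 matrix whose diagonal holds the individual squared deviations.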

The product $X^TX$, $\sum x^2$, now becomes the variance when divided by
another scalar, N, but that is not a matrix operation. Instead of reverting
to scalar algebra, two new principles are noted:

1. When a matrix is multiplied by a scalar, all the elements of that matrix
are multiplied by the scalar (and in this case, there is no difference
between premultiplication and post-multiplication); and

2. When three or more matrices are multiplied together, the temporal
sequence in which the multiplications take place is completely immaterial,
provided the left-to-right order of the matrices is unchanged.

In the general case, with matrices represented as A, B, and C, and with
parentheses indicating the two matrices first multiplied together, the second
principle can be expressed as

$$ABC = (AB)C = A(BC)$$

In the specific instance, with 1/N as a scalar, it can be said that

$$\left(\frac{1}{N}X^T\right)X = \frac{1}{N}\left(X^TX\right)$$

Choosing the $[(1/N)X^T]X$ route to the variance, we multiply $X^T$ by
1/N and post-multiply the resultant row vector by the original column
vector X. Then

$$\begin{bmatrix} \frac{x_1}{N} & \frac{x_2}{N} & \frac{x_3}{N} & \cdots & \frac{x_N}{N} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_N \end{bmatrix} = \frac{\sum x^2}{N} \tag{16.2}$$


Numerical operations as indicated in matrix notation are not necessarily
the most efficient computationally. The operations indicated in Formula
5.3 are actually more efficient for finding the variance than the operations
indicated in Eq. 16.2, while the conventional raw-score formula

$$V = \frac{\sum X^2}{N} - M^2 \tag{5.3a}$$

is more efficient than either. We have used matrix notation in dealing with a
single variable only to illustrate the notation itself. Concepts and
procedures that seem slightly awkward in relatively simple instances become
very appropriate when the same form of operation is extended to a large
number of variables.

All operations in the multiplication of larger matrices follow precisely
the model of Eq. 16.1; that is, the row elements of the first matrix are
multiplied by the column elements of the second matrix, and the products
are summed. For this reason, the number of elements in the rows of the
first matrix must be identical with the number of elements in the columns
of the second matrix.


In matrix multiplication, every row vector in the first matrix must be
post-multiplied by every column vector in the second, and the resultant
scalar (the sum of "inner products" of pairs of elements) becomes the
element in the ith row and jth column of the product matrix. Thus each
cell entry, $a_{ij}$, in the product matrix is formed by multiplying together
two vectors, row i in the first matrix and column j in the second.
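
The row-by-column rule can be written out as a short Python sketch. The function name `matmul_counted` and the two small matrices are assumptions for illustration. The function also counts scalar multiplications, which for an m x n matrix times an n x p matrix comes to m x n x p.

```python
def matmul_counted(first, second):
    """Return (product, number of scalar multiplications) for first x second."""
    m, n, p = len(first), len(second), len(second[0])
    if any(len(row) != n for row in first):
        raise ValueError("row length of first must equal column length of second")
    product = [[0] * p for _ in range(m)]
    count = 0
    for i in range(m):              # each row of the first matrix
        for j in range(p):          # each column of the second matrix
            for k in range(n):      # pair up elements and accumulate the sum
                product[i][j] += first[i][k] * second[k][j]
                count += 1
    return product, count

A = [[1, 2, 3], [4, 5, 6]]          # order 2 x 3 (illustrative values)
B = [[1, 0], [0, 1], [2, 2]]        # order 3 x 2

AB, n_ab = matmul_counted(A, B)     # order 2 x 2
BA, n_ba = matmul_counted(B, A)     # order 3 x 3
```

Here AB and BA are not even of the same order, a concrete case of the rule that AB does not in general equal BA.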

If the order of the first matrix is m x n and that of the second n x p, the
order of the product matrix will be m x p. Since the sum of a series of
multiplications is needed for each element in the product matrix, the total
number of multiplications required is m x n x p.

A MATRIX OF r'S AS A PRODUCT OF TWO MATRICES


As an example of matrix multiplication, consider the development of R,
a matrix of correlations with 1.00 in the diagonal (that is, a matrix of
covariances and variances in z form).

If we transpose Z (the matrix shown in Table 16.3), multiply the transpose
by the scalar 1/N, and then post-multiply $(1/N)Z^T$ by Z, the elements
in the diagonal of the product matrix will be z-score variances (1.00), and
the remaining elements will be z-score covariances, or correlation
coefficients. The formula in matrix notation for a correlation
matrix of any size is

$$\frac{1}{N}Z^TZ = R \tag{16.3}$$


In Table 16.5 the information of Table 16.3 is displayed algebraically
and numerically in the format required by Formula 16.3. Letter subscripts
indicate individuals; numerical subscripts indicate variables. In accordance
with customary usage, the subscripts of the elements of the transpose
retain their original sequence, and hence refer first to column and then
to row.

The computation of two of the elements in the product matrix can
be shown analytically. The first row of the first matrix is post-multiplied
by the first column of the second matrix, yielding

$$\frac{z_{a1}}{N}z_{a1} + \frac{z_{b1}}{N}z_{b1} + \frac{z_{c1}}{N}z_{c1} + \cdots = \frac{\sum z_1^2}{N} = 1.00$$

A correlation coefficient, $r_{12}$, is formed by post-multiplying the first
row of the first matrix by the second column of the second matrix:

$$\frac{z_{a1}}{N}z_{a2} + \frac{z_{b1}}{N}z_{b2} + \frac{z_{c1}}{N}z_{c2} + \cdots = \frac{\sum z_1z_2}{N} = C_{12} = r_{12}$$


From Table 16.5 it may be noted that, when the matrix is premultiplied 
by its transpose, the product matrix is symmetrical. (In this case the 
multiplication by the scalar is immaterial.) It is always true that the product 
matrix of any matrix and its transpose (either by pre- or post-multipli- 
cation) is symmetrical. Furthermore, a matrix and its transpose can 
always be multiplied together, since the number of elements in the rows 
of the first will equal the number of elements in the columns of the second. 


[Table 16.5, which displays $(1/N)Z^T$, $Z$, and their product $R$
algebraically and numerically, is not legible in this copy.]

If A is an m x n matrix, $A^T$ will be of order n x m. The product $AA^T$ will
be of order m. On the other hand, the product $A^TA$ will be of order n. In
the case of the original matrix of psychological test scores, there are n
columns representing tests. Consequently, when a transpose of one form
of the score matrix is post-multiplied by the matrix, the result is a
symmetrical matrix of order n, representing the intercorrelations.
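
Formula 16.3 can be tried numerically. The sketch below starts from the array printed earlier as the transpose of S, converts each variable to z scores (standard deviations computed with N in the denominator, to match Formula 5.3), and forms R. The names `z_scores`, `Z_T`, and `R` are assumptions made for the example.

```python
from math import sqrt

# Rows are the four variables, columns the nine individuals (transpose of S).
S_T = [
    [121, 114, 114, 107, 100, 93, 86, 86, 79],
    [46, 49, 40, 34, 43, 46, 37, 34, 31],
    [26, 20, 24, 22, 24, 16, 18, 14, 16],
    [50, 58, 58, 62, 42, 54, 42, 38, 46],
]
N = len(S_T[0])

def z_scores(values):
    """Deviations from the mean divided by the standard deviation (N in the denominator)."""
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]

means = [sum(row) / N for row in S_T]
Z_T = [z_scores(row) for row in S_T]          # order 4 x 9, the transpose of Z

# R = (1/N) Z^T Z: each row of Z^T times each column of Z, divided by N.
# The columns of Z are simply the rows of Z^T.
R = [[sum(a * b for a, b in zip(row_j, row_k)) / N for row_k in Z_T]
     for row_j in Z_T]
```

The result is a 4 x 4 symmetrical matrix with unity (apart from rounding error) in each diagonal cell.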

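Let C be a variance-covariance matrix of three variables in deviation
form and let $\Delta_{1/s}$ be a diagonal matrix in which the elements are the
reciprocals of the three standard deviations. Then, when $\Delta_{1/s}$
premultiplies C, the result will be

$$\begin{bmatrix} \frac{1}{s_1} & 0 & 0 \\ 0 & \frac{1}{s_2} & 0 \\ 0 & 0 & \frac{1}{s_3} \end{bmatrix} \begin{bmatrix} V_1 & C_{12} & C_{13} \\ C_{21} & V_2 & C_{23} \\ C_{31} & C_{32} & V_3 \end{bmatrix} = \begin{bmatrix} \frac{V_1}{s_1} & \frac{C_{12}}{s_1} & \frac{C_{13}}{s_1} \\ \frac{C_{21}}{s_2} & \frac{V_2}{s_2} & \frac{C_{23}}{s_2} \\ \frac{C_{31}}{s_3} & \frac{C_{32}}{s_3} & \frac{V_3}{s_3} \end{bmatrix}$$

When C is post-multiplied by $\Delta_{1/s}$, the elements in each column of C are
multiplied by the corresponding nonzero column elements of $\Delta_{1/s}$, as
follows:

$$\begin{bmatrix} V_1 & C_{12} & C_{13} \\ C_{21} & V_2 & C_{23} \\ C_{31} & C_{32} & V_3 \end{bmatrix} \begin{bmatrix} \frac{1}{s_1} & 0 & 0 \\ 0 & \frac{1}{s_2} & 0 \\ 0 & 0 & \frac{1}{s_3} \end{bmatrix} = \begin{bmatrix} \frac{V_1}{s_1} & \frac{C_{12}}{s_2} & \frac{C_{13}}{s_3} \\ \frac{C_{21}}{s_1} & \frac{V_2}{s_2} & \frac{C_{23}}{s_3} \\ \frac{C_{31}}{s_1} & \frac{C_{32}}{s_2} & \frac{V_3}{s_3} \end{bmatrix}$$

It follows that when a variance-covariance matrix obtained from
deviation scores is premultiplied and post-multiplied by a diagonal
matrix of reciprocals of standard deviations, the result is a
variance-covariance matrix in z form; that is to say, a matrix of correlation
coefficients with unity in the diagonal. In matrix notation,

$$\Delta_{1/s}C\Delta_{1/s} = R \tag{16.4}$$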


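Since the sequence in which matrix multiplications are performed is
inconsequential, we can first premultiply C by $\Delta_{1/s}$, and then
post-multiply the resulting product by $\Delta_{1/s}$; or we may start by
post-multiplying C by $\Delta_{1/s}$ and then premultiplying the result by
$\Delta_{1/s}$. In terms of scalars within the matrices:

$$\begin{bmatrix} \frac{1}{s_1} & 0 & 0 \\ 0 & \frac{1}{s_2} & 0 \\ 0 & 0 & \frac{1}{s_3} \end{bmatrix} \begin{bmatrix} V_1 & C_{12} & C_{13} \\ C_{21} & V_2 & C_{23} \\ C_{31} & C_{32} & V_3 \end{bmatrix} \begin{bmatrix} \frac{1}{s_1} & 0 & 0 \\ 0 & \frac{1}{s_2} & 0 \\ 0 & 0 & \frac{1}{s_3} \end{bmatrix} = \begin{bmatrix} \frac{V_1}{s_1s_1} & \frac{C_{12}}{s_1s_2} & \frac{C_{13}}{s_1s_3} \\ \frac{C_{21}}{s_2s_1} & \frac{V_2}{s_2s_2} & \frac{C_{23}}{s_2s_3} \\ \frac{C_{31}}{s_3s_1} & \frac{C_{32}}{s_3s_2} & \frac{V_3}{s_3s_3} \end{bmatrix} = \begin{bmatrix} 1.00 & r_{12} & r_{13} \\ r_{21} & 1.00 & r_{23} \\ r_{31} & r_{32} & 1.00 \end{bmatrix}$$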
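
A small numerical check of Eq. 16.4 is sketched below in Python, using exact fractions. The two-variable variance-covariance matrix (variances 4 and 9, covariance 3) and the helper `matmul` are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from fractions import Fraction

def matmul(first, second):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [
        [sum(row[k] * second[k][j] for k in range(len(second)))
         for j in range(len(second[0]))]
        for row in first
    ]

# Hypothetical variance-covariance matrix: standard deviations 2 and 3.
C = [[Fraction(4), Fraction(3)],
     [Fraction(3), Fraction(9)]]
# Diagonal matrix of reciprocals of the standard deviations.
delta = [[Fraction(1, 2), Fraction(0)],
         [Fraction(0), Fraction(1, 3)]]

pre_first = matmul(matmul(delta, C), delta)    # premultiply first, then post-multiply
post_first = matmul(delta, matmul(C, delta))   # post-multiply first, then premultiply
```

Both routes give unity in the diagonal and 3/(2 x 3) = 1/2 in the off-diagonal cells.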


We now have a way of indicating how to find R, the symmetrical matrix
of correlation coefficients, from X, the matrix of deviation scores displayed
in Table 16.2. By analogy with Eq. 16.3, C can be found from X as
follows:

$$\frac{1}{N}X^TX = C \tag{16.3a}$$

Substituting $(1/N)X^TX$ for C in Eq. 16.4 and moving the scalar 1/N to
the left yield

$$\frac{1}{N}\Delta_{1/s}X^TX\Delta_{1/s} = R \tag{16.3b}$$


If we wish, we can work back to S, the original matrix of raw scores
displayed in Table 16.1. To find X, the matrix of deviations from the
respective means of the variables, we need to subtract M (a matrix of
means) from S. M can be found by premultiplying S by U, an m x m
matrix (m being the number of rows of S, that is, N), all the elements of
which are 1/N. Thus,


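$$S - US = S - M = X$$

$$\begin{bmatrix} X_{a1} & X_{a2} \\ X_{b1} & X_{b2} \\ X_{c1} & X_{c2} \end{bmatrix} - \begin{bmatrix} \frac{1}{N} & \frac{1}{N} & \frac{1}{N} \\ \frac{1}{N} & \frac{1}{N} & \frac{1}{N} \\ \frac{1}{N} & \frac{1}{N} & \frac{1}{N} \end{bmatrix} \begin{bmatrix} X_{a1} & X_{a2} \\ X_{b1} & X_{b2} \\ X_{c1} & X_{c2} \end{bmatrix} = \begin{bmatrix} X_{a1} & X_{a2} \\ X_{b1} & X_{b2} \\ X_{c1} & X_{c2} \end{bmatrix} - \begin{bmatrix} M_1 & M_2 \\ M_1 & M_2 \\ M_1 & M_2 \end{bmatrix} = \begin{bmatrix} x_{a1} & x_{a2} \\ x_{b1} & x_{b2} \\ x_{c1} & x_{c2} \end{bmatrix}$$

It is now known that

$$X = S - US$$

Accordingly, the formula for R becomes

$$\frac{1}{N}\Delta_{1/s}(S - US)^T(S - US)\Delta_{1/s} = R \tag{16.3c}$$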
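
The step from raw scores to deviations can be sketched in Python as follows. The raw-score matrix S (three individuals, two tests) is hypothetical, and `matmul` is an illustrative helper; exact fractions are used so that the means come out without rounding.

```python
from fractions import Fraction

def matmul(first, second):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [
        [sum(row[k] * second[k][j] for k in range(len(second)))
         for j in range(len(second[0]))]
        for row in first
    ]

S = [[1, 4], [2, 6], [6, 8]]                        # hypothetical raw scores, N = 3
N = len(S)
U = [[Fraction(1, N)] * N for _ in range(N)]        # N x N, every element 1/N

M = matmul(U, S)                                    # every row holds the column means
X = [[s - m for s, m in zip(s_row, m_row)]          # X = S - US
     for s_row, m_row in zip(S, M)]
column_sums = [sum(row[j] for row in X) for j in range(len(S[0]))]
```

Each row of M repeats the two means (3 and 6), and each column of X sums to zero, as deviation scores must.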


THE IDENTITY (OR UNIT) MATRIX IN MATRIX EQUATIONS 


An important device for use in matrix equations is a square, symmetrical
matrix consisting of ones in the main diagonal and zeros in all other cells.
Generally, it is called the identity matrix, although unit matrix is an
alternate designation. Its symbol is I, and its function resembles that of the
number 1 in scalar algebra in that when any matrix A is premultiplied or
post-multiplied by an identity matrix I, the product matrix is A. That is to
say,

$$IA = AI = A \tag{16.5}$$

If A is of order m x n, a premultiplying identity matrix must be of
order m, while an identity matrix used as a post-multiplier must be of
order n.


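When the usual rules for matrix multiplication are followed, it can be
seen that Eq. 16.5 holds:

$$IA = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = A$$

Similarly

$$AI = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} = A$$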


In a matrix equation, an I may be introduced anywhere as a multiplier.
Let one side of an equation begin with matrix A and let I of the same order
be introduced as a premultiplier on the other side. Both A and I can be
modified if both are premultiplied by a matrix of proper order and of a
nature appropriate to our purpose. Such a matrix would have the effect of
multiplying (or dividing) all the elements of one or more corresponding
rows of A and I by a constant. Another type of permissible operation
(because it can also be written as matrix premultiplication) is to add to, or
subtract from, a row of A any multiple of another row of A, while the
same operation is carried out on the corresponding rows of I.

Since both matrices are being modified by premultipliers, operations are
limited to rows. (If A and I were established so that they were being
modified by post-multiplication, operations would be entirely by columns.)
Through a series of such operations on the two sides of the equation, both
A and I can be radically modified. A procedure that is often useful is to
modify A by operations described above until A becomes exactly equal to
the original I, and hence can be dropped out of the equation. By the same
operations, what was the original I becomes a matrix of interesting
properties: It is the inverse (or reciprocal matrix) of A, denoted as $A^{-1}$.
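
The procedure of carrying A into I by row operations, while the same operations carry I into the inverse, can be sketched in Python. The function name and the 2 x 2 example are assumptions made for illustration. The sketch also interchanges rows when a zero turns up in the pivot position; that step is not mentioned in the passage above and is added here only so the routine does not divide by zero.

```python
from fractions import Fraction

def inverse_by_row_operations(matrix):
    """Reduce A to I by row operations; the same operations turn I into the inverse."""
    n = len(matrix)
    a = [[Fraction(v) for v in row] for row in matrix]
    inv = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Locate a row with a nonzero element in this column (added safeguard).
        pivot_row = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is singular; it has no inverse")
        a[col], a[pivot_row] = a[pivot_row], a[col]
        inv[col], inv[pivot_row] = inv[pivot_row], inv[col]
        # Divide the pivot row of both matrices by the pivot element.
        pivot = a[col][col]
        a[col] = [v / pivot for v in a[col]]
        inv[col] = [v / pivot for v in inv[col]]
        # Subtract multiples of the pivot row from every other row.
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
                inv[r] = [x - factor * y for x, y in zip(inv[r], inv[col])]
    return inv

A = [[2, 1], [5, 3]]                      # illustrative matrix
A_inv = inverse_by_row_operations(A)
```

For this A the routine returns rows (3, -1) and (-5, 2); multiplying that result by A in either sequence gives I.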


THE INVERSE 


Only a square matrix can have an inverse, and not all square matrices
qualify. If a square matrix has no inverse, it is called singular; if it has an
inverse, it is called regular. Later, when determinants are considered, a
procedure will be noted for discovering whether or not a matrix has an
inverse.

Let A be a regular, square matrix. Then $A^{-1}$, its inverse, may be defined
as a matrix of the same order, which when used either as a premultiplier
or post-multiplier of A, yields the identity matrix I as the product. Thus,
by definition,

$$A^{-1}A = AA^{-1} = I \tag{16.5a}$$

The identity matrix is square and regular, and its inverse is the identity
matrix itself; that is,

$$I = I^{-1}$$

as can be demonstrated by multiplying any I by itself, yielding I.

By a few steps in matrix algebra it can be proved that if A has an inverse,
then the inverse, $A^{-1}$, is unique.

Let us assume for a moment that B is a matrix different from $A^{-1}$ but
also an inverse of A, so that

$$BA = AB = I \tag{16.6}$$

It follows from Eqs. 16.5 and 16.5a that

$$B = BI = B(AA^{-1}) \tag{16.7}$$

This last multiplication can also be accomplished as $(BA)A^{-1}$. In Eq. 16.6,
however, it has been stated that BA = I; therefore, Eq. 16.7 becomes

$$B = BI = B(AA^{-1}) = (BA)A^{-1} = IA^{-1} = A^{-1}$$

and B, supposedly an inverse different from $A^{-1}$, is really $A^{-1}$. Therefore
it has been shown that the inverse of a regular matrix is unique.

An important property of the inverse has been hinted at in the operations
on A and I on different sides of an equation, reducing A to I and modifying
I to $A^{-1}$.

Whenever a first premultiplying (or final post-multiplying) matrix is
transferred from one side of a matrix equation to another, it appears as an
inverse and as a first premultiplier (or final post-multiplier) on the second
side.

Thus, if AB = C, then $B = A^{-1}C$ or $A = CB^{-1}$. If two or more matrices
in multiplicative arrangement making up the side of an equation are
transferred to the other side, they appear as inverses, but in inverse order.
Thus, if ABC = D, then $D^{-1} = C^{-1}B^{-1}A^{-1}$.


OTHER PRINCIPLES IN MATRIX EQUATIONS 


As described earlier, the transpose consists of exactly the same elements 
as the original matrix, but the elements of the rows of the original are 
written as the elements of the columns of the transpose, and vice versa. 

As applied to inverses, the transpose of an inverse is the inverse of the
transpose. That is, $(A^{-1})^T = (A^T)^{-1}$.

A rule on transposes is often useful in manipulating matrix equations.
The transpose of the product of any set of matrices is the product of their
transposes, but in reverse order. Thus,

$$(AB)^T = B^TA^T \quad \text{and} \quad (ABC)^T = C^TB^TA^T$$


Another principle is that the product of two matrices may be a zero
matrix, 0, even if neither matrix is a zero matrix. Consider the matrix
multiplication below, in which neither A nor B is a zero matrix:

$$AB = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0$$

However, if any matrix is multiplied by a zero matrix, the product is a
zero matrix, which is cognate with the principle of multiplication by 0
in scalar algebra. Thus,

$$0 \cdot A = A \cdot 0 = 0$$

FINDING BETAS FROM A MATRIX EQUATION
In Chapter 7 it was noted that the n final beta weights in multiple
regression are found by solving n simultaneous linear equations. These are
so-called normal equations. In a set of these equations, the correlations
(or z score covariances) are known, while the betas are unknown.
With three predictors, these equations can be written as

$$V_3\beta_{03.12} + C_{32}\beta_{02.13} + C_{31}\beta_{01.23} = C_{30} \tag{16.8}$$
$$C_{23}\beta_{03.12} + V_2\beta_{02.13} + C_{21}\beta_{01.23} = C_{20} \tag{16.9}$$
$$C_{13}\beta_{03.12} + C_{12}\beta_{02.13} + V_1\beta_{01.23} = C_{10} \tag{16.10}$$

Each variance (equal to 1.00) is represented as $V_i$. The covariances,
numerically equal to the r's, are symmetrical; that is, $C_{ij} = C_{ji}$.


ESTABLISHING A MATRIX EQUATION 


A matrix often indicates an arrangement of coefficients detached from a
set of equations. The known numerical coefficients can be detached from
Eqs. 16.8, 16.9, and 16.10, and the resultant matrix is denoted as R.
The vector of betas can be indicated as $\beta$, while the column vector of
validities or z-score covariances between the three predictors and the
criterion variable, 0, is $C_0$. The matrix equation in expanded form and in
concise matrix notation becomes

$$\begin{bmatrix} V_3 & C_{32} & C_{31} \\ C_{23} & V_2 & C_{21} \\ C_{13} & C_{12} & V_1 \end{bmatrix} \begin{bmatrix} \beta_{03.12} \\ \beta_{02.13} \\ \beta_{01.23} \end{bmatrix} = \begin{bmatrix} C_{30} \\ C_{20} \\ C_{10} \end{bmatrix}$$

or

$$R\beta = C_0 \tag{16.11}$$

Premultiplication of both sides of Eq. 16.11 by $R^{-1}$ yields

$$R^{-1}R\beta = R^{-1}C_0 \tag{16.11a}$$

or

$$I\beta = \beta = R^{-1}C_0 \tag{16.11b}$$

In words, the column vector of beta coefficients can be found by
premultiplying the column vector of validity coefficients by the inverse of
the correlation matrix.

In more general notation, if

$$AB = C \tag{16.12}$$

then $A^{-1}AB = A^{-1}C$, and $B = A^{-1}C$.

In Eq. 16.12, A might be a matrix of correlation coefficients, B a matrix
of final order betas for predicting different criteria, and C the matrix of
covariances between all the predictors and the n criteria. Thus, if there are
a number of criteria, the betas for predicting each criterion can be computed
by premultiplying the matrix of validities by the inverse of the matrix of
intercorrelations of the predictors.

Matrix methods provide a systematic guide to the solution of
simultaneous equations, but any operation that can be performed with matrices
can in general be performed analytically. Sometimes analytic and matrix
procedures are parallel or even numerically identical, but often the
variants of the analytic methods are the more numerous.


SOLUTION OF SIMULTANEOUS LINEAR EQUATIONS 


The three equations below have correlation coefficients as the knowns
and the three final betas for predicting variable 0 as the unknowns.
They are solved below by a conventional procedure of eliminating one
unknown at a time from the system, first $\beta_{03.12}$, and then $\beta_{02.13}$.
After an equation has been established with $\beta_{01.23}$ as the single
unknown, the betas that have been eliminated from the system
($\beta_{03.12}$ and $\beta_{02.13}$) are found from a "back solution."
With numerical values representing a set of correlation coefficients and
with variances of unity, Eqs. 16.8 to 16.10 become

$$\beta_{03.12} + .60\beta_{02.13} + .50\beta_{01.23} = .30 \tag{16.8a}$$
$$.60\beta_{03.12} + \beta_{02.13} + .70\beta_{01.23} = .50 \tag{16.9a}$$
$$.50\beta_{03.12} + .70\beta_{02.13} + \beta_{01.23} = .60 \tag{16.10a}$$


Multiplying Eq. 16.8a by .60:
$$.60\beta_{03.12} + .36\beta_{02.13} + .30\beta_{01.23} = .18 \tag{16.8b}$$
Subtracting Eq. 16.8b from Eq. 16.9a:
$$.64\beta_{02.13} + .40\beta_{01.23} = .32 \tag{16.13}$$
Multiplying Eq. 16.8a by .50:
$$.50\beta_{03.12} + .30\beta_{02.13} + .25\beta_{01.23} = .15 \tag{16.8c}$$
Subtracting Eq. 16.8c from Eq. 16.10a:
$$.40\beta_{02.13} + .75\beta_{01.23} = .45 \tag{16.14}$$
Multiplying Eq. 16.13 by .625:
$$.40\beta_{02.13} + .25\beta_{01.23} = .20 \tag{16.13a}$$
Subtracting Eq. 16.13a from Eq. 16.14:
$$.50\beta_{01.23} = .25 \tag{16.15}$$
and
$$\beta_{01.23} = .50 \tag{16.15a}$$

For the back solution, .50 is substituted for $\beta_{01.23}$ in Eq. 16.13a,
yielding
$$.40\beta_{02.13} + (.25 \times .50) = .20 \tag{16.13b}$$
and
$$\beta_{02.13} = .1875 \tag{16.13c}$$
Substitution of .1875 for $\beta_{02.13}$ and .5000 for $\beta_{01.23}$ in Eq. 16.8a yields
$$\beta_{03.12} + (.60 \times .1875) + (.50 \times .5000) = .30 \tag{16.8d}$$
or
$$\beta_{03.12} = -.0625 \tag{16.8e}$$
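
The elimination and back solution just carried out can be mirrored in a short Python sketch. The function name `solve_by_elimination` is an assumption, exact fractions stand in for hand arithmetic, and the routine assumes that no zero turns up in a pivot position (true for these equations).

```python
from fractions import Fraction

def solve_by_elimination(coefficients, constants):
    """Eliminate one unknown at a time, then recover the rest by back solution."""
    n = len(constants)
    # Each working row holds the coefficients followed by the constant term.
    rows = [[Fraction(str(v)) for v in row] + [Fraction(str(c))]
            for row, c in zip(coefficients, constants)]
    # Forward elimination: subtract a multiple of row k from every later row.
    for k in range(n):
        for i in range(k + 1, n):
            factor = rows[i][k] / rows[k][k]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[k])]
    # Back solution: the last unknown first, then substitute upward.
    solution = [Fraction(0)] * n
    for k in reversed(range(n)):
        known = sum(rows[k][j] * solution[j] for j in range(k + 1, n))
        solution[k] = (rows[k][n] - known) / rows[k][k]
    return solution

# Coefficients and constants of Eqs. 16.8a, 16.9a, and 16.10a.
R = [[1, 0.60, 0.50], [0.60, 1, 0.70], [0.50, 0.70, 1]]
validities = [0.30, 0.50, 0.60]

betas = solve_by_elimination(R, validities)   # order: beta 03.12, 02.13, 01.23
betas_decimal = [float(b) for b in betas]
```

The three values agree with Eqs. 16.8e, 16.13c, and 16.15a: -.0625, .1875, and .5000.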


THE MATRIX SOLUTION 


Identical results can be obtained by abandoning the apparatus of the
formal equations and working only with coefficients. Since each of the
coefficients representing the intercorrelations appears twice in the equations,
the procedure can be simplified by using a format in which each such
coefficient is written only once. In Example 16.1, all the arithmetical
operations used above in solving the equations are exhibited in compact
format. In addition, there is a line in each successive matrix in which the
variance of the criterion is reduced, as described in Chapter 7.


EXAMPLE 16.1. FINDING BETAS BY REDUCTION ROUTINE AND BACK SOLUTION

                                      VARIABLES
VARIABLES   FINAL BETAS       3        2        1        0
3           (-.0625)      1.000     .600     .500     .300
2           (.1875)        .600    1.000     .700     .500
1           (.5000)        .500             1.000     .600
0                          .300                      1.000
2.3         (.1875)                 .640     .400     .320
1.3         (.5000)                 .625     .750     .450
0.3                                 .500              .910
1.23        (.5000)                          .500     .250
0.23                                         .500     .750
0.123                                                 .625

$$R_{0.123} = \sqrt{1 - .625} = .61$$

$$r_{01.23} = \frac{.250}{\sqrt{.500}\,\sqrt{.750}} = \frac{.250}{(.707)(.866)} = .41$$


The matrix routine in Example 16.1 can be interpreted as:

1. Solving symmetrical simultaneous linear equations with steps identical
to those used in Eqs. 16.8a through 16.8e;

2. Forming successive matrices of variances and covariances of residual
variables in z form, as described in Chapter 7;

3. Evaluating the determinant of the matrix of r's by condensation, as
discussed below; or

4. Beginning a process that would lead to the inverse.


In this case, as in many other cases, one and the same numerical routine 
has alternate mathematical interpretations. When operations are con- 
ceptualized in terms of matrices and determinants, the advantage lies in 
succinct and clear-cut statements of the operations performed. 


DETERMINANTS 


Any square matrix of numbers can be regarded as a determinant,¹ thereby
acquiring new mathematical properties, including a single, fixed numerical
value.

In statistics, the variances and covariances of any set of variables can
be arranged in a square matrix, which may then be considered a
determinant. The principles of determinants often permit inferences about
large groups of data. For example, if the determinant of a variance-covariance
matrix is zero, then the multiple correlation of at least one variable in the
matrix with the remaining variables is precisely 1.00.

A special case of a variance-covariance matrix is a matrix of correlation 
coefficients with unity in each of the diagonal cells. The upper limit of the 
determinant of such a matrix is 1.00, found only when all intercorrelations 
are precisely .00. 

When a determinant is evaluated, its numerical value is found. Evalu- 
ation may be by any one of several routines which superficially appear to be 
widely different, yet all are precisely equivalent and yield identical results. 

In a variance-covariance matrix, the sequence of the variables in the
rows is the same as the sequence of the variables in the columns. Changes
in this sequence have no effect on the value of the determinant. If two rows
alone (or two columns alone) are interchanged, the determinant changes sign,
but in statistical work, pairs of rows and corresponding pairs of columns are
ordinarily interchanged simultaneously, an operation that does not change
the value of the determinant at all.


TERMINOLOGY AND NOTATION OF DETERMINANTS 


When a matrix, A, is regarded as a determinant, it is customary to border
it on either side with a single line. The border lines may be applied either
to the letter denoting the matrix, as |A|, or to the display of the elements,
or both. Another general symbol for a determinant is the Greek capital
letter $\Delta$. Occasionally D is used. Thus a determinant of the
intercorrelations of variables 1 through 4, with the z-score variance of unity
in each of the diagonal cells, is

$$\Delta = \begin{vmatrix} 1.00 & r_{12} & r_{13} & r_{14} \\ r_{21} & 1.00 & r_{23} & r_{24} \\ r_{31} & r_{32} & 1.00 & r_{34} \\ r_{41} & r_{42} & r_{43} & 1.00 \end{vmatrix}$$

¹ Mathematicians often define a determinant as a scalar function of a square matrix;
that is, a unique numerical value derived from the matrix by a specified operation.
Here, for ease in presentation, we have denoted a square matrix that can be so
evaluated as a determinant.


MINORS OF A DETERMINANT 


Determinants formed by eliminating one or more rows and the same
number of columns from a determinant are called minors of that
determinant. A first minor is formed by the elimination of one row and one
column; a second minor, by the elimination of two rows and two columns;
and so on. A principal minor is formed by eliminating corresponding
rows and columns; consequently, the principal diagonal of this type of
minor is composed exclusively of elements of the principal diagonal of
the original determinant.

A first minor with sign assigned in a special manner is called a cofactor.
The evaluation of any minor (as of any determinant) may result in zero
or in a positive or negative number. Let i be the number of the row that
has been eliminated and j the number of the column. Then $(-1)^{i+j}$ is an
additional sign that helps to determine the final value of the cofactor.
Suppose the value of a minor, found by eliminating the second row and
third column, is -.2; the value of the cofactor would be $(-1)^{2+3}(-.2)$,
or +.2. If i and j add to an even number, the minor and the cofactor are
identical.

Minors may be identified by a superscript indicating the order of the
total determinant and by subscripts indicating the row and column or the
rows and columns that have been deleted. Thus, $\Delta^n_{23}$ would be a first
minor formed by deleting the second row and third column from a
determinant of order n; and the corresponding cofactor would be
$(-1)^{2+3}\Delta^n_{23} = -\Delta^n_{23}$. Similarly, $\Delta^n_{22,33}$ would be the
principal second minor [of order (n - 2)] formed by deleting the second and
third rows and the second and third columns.


EVALUATION OF DETERMINANTS 


By definition, a determinant of order n is the sum of n! (factorial n) terms,
half with attached positive sign and half with attached negative sign, and
each the product of n elements. The n elements are assigned to each term
in such a manner that no two of them come from the same row or column.
The direct method of evaluation, implicit in the definition, is excellent
for evaluating determinants of order two or three, but when the order is
greater than three, other methods are more convenient.

Direct evaluations are shown below. In finding the value of a determinant
of order two, the elements in the main diagonal are multiplied together.
From this product is subtracted the product of the other two terms. Thus,

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \quad \text{or} \quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{21}a_{12}$$


In evaluating a determinant of order three, the three products with
attached positive sign are formed by multiplying groups of three elements
in the diagonals from the upper left to lower right. In the two instances
in which there are two elements in a diagonal, the third is the isolated
element farthest away. Thus, in Eq. 16.16 below, the three terms that are
added to find the determinant are aei, bfg, and dhc. The three negative
terms are formed similarly, but by working in the diagonals from lower left
to upper right. Accordingly, the terms are -gec, -hfa, and -dbi. Thus,

$$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + dhc - gec - hfa - dbi \tag{16.16}$$

or in notation with subscripts indicating rows and columns,

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{21}a_{32}a_{13} - a_{31}a_{22}a_{13} - a_{32}a_{23}a_{11} - a_{21}a_{12}a_{33} \tag{16.16a}$$
A special case is a symmetrical determinant of order three, with unity
in each of the diagonal cells. By the procedure given above, it will be
found that

$$\Delta = \begin{vmatrix} 1.00 & r_{12} & r_{13} \\ r_{21} & 1.00 & r_{23} \\ r_{31} & r_{32} & 1.00 \end{vmatrix} = 1 + 2r_{12}r_{13}r_{23} - r_{12}^2 - r_{13}^2 - r_{23}^2 \tag{16.16b}$$

in which $r_{12} = r_{21}$, $r_{13} = r_{31}$, and $r_{23} = r_{32}$.
While these procedures may seem to be arbitrary, they are actually based
on efficient and systematic manipulations of coefficients when linear
equations in n unknowns are solved.
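
Both direct rules for order three can be checked against each other in Python. The function names are assumptions; the three correlations (.60, .50, .70) are the intercorrelations of the predictors in the numerical equations solved earlier in the chapter, used here simply as convenient values.

```python
from fractions import Fraction

def det3_direct(m):
    """Six-term rule of Eq. 16.16 for a determinant of order three."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + d * h * c - g * e * c - h * f * a - d * b * i

def det3_correlation(r12, r13, r23):
    """Special case for a symmetrical determinant with unity in the diagonal."""
    return 1 + 2 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2

r12, r13, r23 = Fraction(3, 5), Fraction(1, 2), Fraction(7, 10)
matrix = [
    [1, r12, r13],
    [r12, 1, r23],
    [r13, r23, 1],
]

by_six_terms = det3_direct(matrix)
by_special_case = det3_correlation(r12, r13, r23)
```

Both routes give 8/25, that is, .32.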


A second procedure is based on the following rule, by means of which
determinants of order n can be evaluated in terms of determinants of
order (n - 1): Any determinant is the sum of the elements of any row or
column of that determinant, each weighted by the cofactor formed by
eliminating from the determinant the row and column in which the
element appears. Thus the determinant of Eq. 16.16a may be evaluated as
follows:

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{31}\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}$$

$$= a_{11}a_{22}a_{33} - a_{11}a_{32}a_{23} - a_{21}a_{12}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{31}a_{22}a_{13} \tag{16.16c}$$

It will be noted that although terms appear in different orders and
although elements within terms are sometimes differently arranged, the
expansion of the determinant in Eq. 16.16a gives exactly the same result
as the expansion of the determinant in Eq. 16.16c.
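
The rule of expansion by cofactors lends itself to a short recursive sketch in Python. The function names and the 3 x 3 example are assumptions made for illustration. Rows and columns are counted from zero in the code, which leaves the sign $(-1)^{i+j}$ unchanged because both indices shift by one.

```python
def minor_matrix(m, i, j):
    """Delete row i and column j (counted from zero)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def determinant(m):
    """Expand along the first column, each element weighted by its cofactor."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** i * m[i][0] * determinant(minor_matrix(m, i, 0))
               for i in range(len(m)))

def cofactor(m, i, j):
    """First minor with the sign (-1)**(i + j) attached."""
    return (-1) ** (i + j) * determinant(minor_matrix(m, i, j))

m = [[2, 1, 0],
     [1, 3, 2],
     [0, 1, 4]]                      # illustrative values

by_first_column = determinant(m)
by_second_row = sum(m[1][j] * cofactor(m, 1, j) for j in range(3))
```

Expanding along the first column or along the second row gives the same value, 16, in line with the statement that any row or column may be used.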


PIVOTAL CONDENSATION 


Large determinants are often solved by a third routine, known as pivotal 
condensation. This is based on three principles: 


l. If each element of a vector of a determinant is multiplied by any 
factor, the value of the determinant is multiplied by that factor; 

2. When the elements of a vector of a determinant are multiplied by any 
factor and the products are added to (or subtracted from) corresponding 
elements of any other vector, the value of the determinant is unchanged; 
and 

3. When all the elements above or below the main diagonal are zeros, the 
value of the determinant is the product of the diagonal elements. 


By means of the second principle, most determinants encountered in
practice can be transformed into triangular form, usually with zero
elements below the principal diagonal. Any determinant of order n is
the sum of n! terms, each the product of n elements. When the determinant
is in triangular form, each of the n! terms except one contains at least
one zero. Hence, the determinant may be evaluated by multiplying together
the final elements in the main diagonal, since this is one of the n! terms
of the determinant and it contains no zero elements unless the value of
the determinant is zero.

The principle of multiplying all the elements of an array by a constant, 
thus changing the value of the determinant by multiplication by that 
constant, sometimes facilitates computation. When used in connection 
with pivotal condensation, any such multiplier becomes the multiplier 
of the product of the diagonal elements. 
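
The routine can be sketched in a few lines of Python. This is an illustrative sketch rather than the author's worksheet procedure: the function name is hypothetical, and the row interchange used when a pivot is zero (which reverses the sign of the determinant) is a standard device that is an assumption here, since it is not among the three principles listed above.

```python
def det_by_condensation(matrix):
    """Reduce to triangular form, then multiply the diagonal elements."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    sign = 1.0
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot_row = next((r for r in range(col, n) if a[r][col] != 0.0), None)
        if pivot_row is None:
            return 0.0  # no usable pivot: the determinant vanishes
        if pivot_row != col:
            a[col], a[pivot_row] = a[pivot_row], a[col]
            sign = -sign  # assumption: a row interchange reverses the sign
        for r in range(col + 1, n):
            # Principle 2: subtract a multiple of the pivot row.
            multiplier = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= multiplier * a[col][c]
    # Principle 3: product of the diagonal of the triangular form.
    product = sign
    for k in range(n):
        product *= a[k][k]
    return product


r_matrix = [[1.0, 0.6, 0.5, 0.3],
            [0.6, 1.0, 0.7, 0.5],
            [0.5, 0.7, 1.0, 0.6],
            [0.3, 0.5, 0.6, 1.0]]
print(round(det_by_condensation(r_matrix), 3))  # 0.2
```

The sample matrix is the four-variable correlation matrix used in the next section, whose pivots are 1.000, .640, .500, and .625.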

SOME STATISTICAL APPLICATIONS OF DETERMINANTS

MULTIPLE R IN TERMS OF DETERMINANTS 


The procedure of finding multiple R by reduction of the variance of the 
criterion variable, as described in Chapter 7 and illustrated in Example 16.1, 
can be considered as one application of the theory of determinants to a 
statistical problem. If the original matrix of r’s, with unity in the diagonal 
cells, is considered as a symmetrical determinant, then multiples of the 
top row are subtracted from the other rows in such a manner that the 
first column is modified until it consists of the pivotal element and zero 
elements. The routine of dividing all other elements in the top row by the 
pivotal element can be considered to be a method of determining the 
proper multiplier to apply to the top row so as to subtract the appropriate 
amounts from the elements in succeeding rows. After the elements in the 
first column below the pivot are reduced to zero, a second cycle of oper- 
ations reduces to zero the elements below the diagonal in the second 
column; and so on, until a triangular matrix results. (The method is also 
applicable to nonsymmetrical determinants, but the multipliers would have 
to be obtained directly by dividing the element to be reduced to zero by 
the pivot.) 


The matrix of r's, with unity in the diagonal cells, given in Example 16.1
can be identified as $A^{n+0}$, with n the number of predictor variables and 0 as
the criterion. Then

$$A^{n+0} = \begin{vmatrix} 1.000 & .600 & .500 & .300 \\ .600 & 1.000 & .700 & .500 \\ .500 & .700 & 1.000 & .600 \\ .300 & .500 & .600 & 1.000 \end{vmatrix} = \begin{vmatrix} 1.000 & .600 & .500 & .300 \\ .000 & .640 & .400 & .320 \\ .000 & .000 & .500 & .250 \\ .000 & .000 & .000 & .625 \end{vmatrix}$$

$$= (1.000)(.640)(.500)(.625) = .200$$


Statistically, the determinant of a variance-covariance matrix is the 
product of the pivotal variances. In this instance, with the variables in 
zero-order or higher-order z scores, and with the order of elimination 
3, 2, and 1, 

$$A^{n+0} = V_3 V_{2.3} V_{1.23} V_{0.123} \qquad (16.17)$$


Since the value of a determinant is not altered when rows are inter- 
changed and when corresponding columns are interchanged simul- 
taneously, the value of the determinant is not affected by the order of 
elimination. This principle is sometimes useful in finding several multiple 
R’s from the same matrix. 

To find the partial variance of the criterion variable, in this case $V_{0.123}$,
it is apparent from Eq. 16.17 that it is necessary only to divide the total
determinant by the determinant of the first n variables; that is, the minor
after the row and the column of the criterion variable have been dropped
out. In general,

$$V_{0.12 \ldots n} = \frac{A^{n+0}}{A^{n+0}_{00}} \qquad (16.18)$$

in which the two subscripts 00 indicate that row 0 and column 0 are no
longer in the determinant.

Since the square of multiple R is unity less the partial variance of the
criterion, a formula for R in determinantal notation is

$$R_{0(12 \ldots n)} = \sqrt{1 - \frac{A^{n+0}}{A^{n+0}_{00}}} \qquad (16.19)$$


PARTIAL r FROM DETERMINANTS 


Formulas for other coefficients used in describing multivariate linear 
relationships, including higher-order covariances, beta weights, and 
specialized correlation coefficients, can be expressed in determinantal
notation. As a simple example, here is a formula for a partial correlation
of the (n − 1)st order in a matrix of (n + 1) variables:

$$r_{01.23 \ldots n} = \frac{A^{n+0}_{01}}{\sqrt{A^{n+0}_{00}}\,\sqrt{A^{n+0}_{11}}} \qquad (16.20)$$

in which the three determinants are first minors of the complete determinant.

In Example 16.2, the principal first minors and an additional first
minor of the correlation matrix given in Example 16.1 are evaluated
and applied to multiple and partial correlation. In Eq. 16.17, the value of
$A^{n+0}$, the total determinant of four variables, was found to be .200.


EXAMPLE 16.2 


MULTIPLE AND PARTIAL CORRELATION BY DETERMINANTS 


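By Eq. 16.17, $A^{n+0}$ (comprising the z-score variances and covariances of
Variables 3, 2, 1, and 0) has a value of .200. Variables in each of the minors below
are identified by numbers within parentheses. Evaluation is by the procedure
represented in Eqs. 16.16, 16.16a, and 16.16c.

Rows and columns (2), (1), (0):

$$A^{n+0}_{33} = \begin{vmatrix} 1.0 & .7 & .5 \\ .7 & 1.0 & .6 \\ .5 & .6 & 1.0 \end{vmatrix} = V_2V_1V_0 + 2r_{12}r_{02}r_{01} - V_0r_{12}^2 - V_1r_{02}^2 - V_2r_{01}^2$$

$$= 1.00 + .42 - .49 - .25 - .36 = .32$$

Rows and columns (3), (1), (0):

$$A^{n+0}_{22} = \begin{vmatrix} 1.0 & .5 & .3 \\ .5 & 1.0 & .6 \\ .3 & .6 & 1.0 \end{vmatrix} = V_3V_1V_0 + 2r_{13}r_{03}r_{01} - V_0r_{13}^2 - V_1r_{03}^2 - V_3r_{01}^2$$

$$= 1.00 + .18 - .25 - .09 - .36 = .48$$

Rows and columns (3), (2), (0):

$$A^{n+0}_{11} = \begin{vmatrix} 1.0 & .6 & .3 \\ .6 & 1.0 & .5 \\ .3 & .5 & 1.0 \end{vmatrix} = V_3V_2V_0 + 2r_{23}r_{03}r_{02} - V_0r_{23}^2 - V_2r_{03}^2 - V_3r_{02}^2$$

$$= 1.00 + .18 - .36 - .09 - .25 = .48$$

Rows and columns (3), (2), (1):

$$A^{n+0}_{00} = \begin{vmatrix} 1.0 & .6 & .5 \\ .6 & 1.0 & .7 \\ .5 & .7 & 1.0 \end{vmatrix} = V_3V_2V_1 + 2r_{23}r_{13}r_{12} - V_1r_{23}^2 - V_2r_{13}^2 - V_3r_{12}^2$$

$$= 1.00 + .42 - .36 - .25 - .49 = .32$$

Rows (3), (2), (1); columns (3), (2), (0):

$$A^{n+0}_{01} = \begin{vmatrix} 1.0 & .6 & .3 \\ .6 & 1.0 & .5 \\ .5 & .7 & .6 \end{vmatrix} = V_3V_2r_{01} + r_{23}r_{12}r_{03} + r_{23}r_{02}r_{13} - r_{13}V_2r_{03} - r_{12}r_{02}V_3 - r_{23}^2r_{01}$$

$$= .600 + .126 + .150 - .150 - .350 - .216 = .16$$

$$R_{0(123)} = \sqrt{1 - \frac{A^{n+0}}{A^{n+0}_{00}}} = \sqrt{1 - \frac{.20}{.32}} = .61$$

$$R_{1(023)} = \sqrt{1 - \frac{A^{n+0}}{A^{n+0}_{11}}} = \sqrt{1 - \frac{.20}{.48}} = .76$$

$$R_{2(013)} = \sqrt{1 - \frac{A^{n+0}}{A^{n+0}_{22}}} = \sqrt{1 - \frac{.20}{.48}} = .76$$

$$r_{01.23} = \frac{A^{n+0}_{01}}{\sqrt{A^{n+0}_{00}}\,\sqrt{A^{n+0}_{11}}} = \frac{.16}{\sqrt{.32}\,\sqrt{.48}} = .41$$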


By the determinantal routine, $R_{0(123)}$ is .61, exactly the same as by
reduction of criterion variance. Similarly, two apparently different routes
lead to a value of .41 for $r_{01.23}$. However, while the methods appear to be
different, the underlying mathematics is actually much the same.
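
The determinantal formulas for multiple R and partial r can be checked with a short Python sketch. It is offered only as an illustration: the function names are hypothetical, the matrix is the four-variable correlation matrix in the order 3, 2, 1, 0, and the partial r is computed from a signed cofactor (a standard identity, assumed here) so that the sign comes out correctly wherever the two variables sit in the matrix.

```python
import math


def minor(m, row, col):
    """Matrix left after deleting one row and one column."""
    return [[v for j, v in enumerate(r) if j != col]
            for i, r in enumerate(m) if i != row]


def det(m):
    """Determinant by cofactor expansion down the first column."""
    if not m:
        return 1.0
    return sum((-1) ** r * m[r][0] * det(minor(m, r, 0)) for r in range(len(m)))


R = [[1.0, 0.6, 0.5, 0.3],   # variable 3
     [0.6, 1.0, 0.7, 0.5],   # variable 2
     [0.5, 0.7, 1.0, 0.6],   # variable 1
     [0.3, 0.5, 0.6, 1.0]]   # variable 0 (criterion)
CRITERION, VAR1 = 3, 2       # row/column positions of variables 0 and 1

total = det(R)
a00 = det(minor(R, CRITERION, CRITERION))
a11 = det(minor(R, VAR1, VAR1))
cofactor_01 = (-1) ** (CRITERION + VAR1) * det(minor(R, CRITERION, VAR1))

multiple_r = math.sqrt(1 - total / a00)           # Eq. 16.19
partial_r = -cofactor_01 / math.sqrt(a00 * a11)   # Eq. 16.20 with explicit sign
print(round(total, 3), round(multiple_r, 2), round(partial_r, 2))  # 0.2 0.61 0.41
```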


DETERMINANTS IN SUMMARIZING MATRICES 


A complete matrix of data, such as a table of intercorrelations, can
sometimes be summarized usefully as a determinant. For example, when a
multiple correlation is 1.00, the partial variance of the criterion, with
variance associated with all other variables removed, is .00. Since this
partial variance is one of the factors that are multiplied together in one of
the methods of evaluating the determinant, it follows, as stated earlier,
that when R = 1.00, the determinant is zero.

If the determinant is positive, all multiple correlations throughout the 
matrix are less than 1.00. If the determinant is negative, one of two con- 
clusions is necessarily true. Either there have been errors in computation 
in finding the correlations or in evaluating the determinant; or, within 
the matrix, there are variables with insufficient freedom, as when there are 
more variables than cases. In the latter instance, the expected determinant
is zero, but through rounding error it may be slightly negative. One
degree of freedom is lost with each variable entering into a multiple R.
When the number of variables increases beyond the number of cases,
there are variables without freedom. Multiple R's computed on data with
markedly limited degrees of freedom are useless.

If the determinant of a matrix is zero (or “vanishes”), the matrix is 
called singular. A singular matrix has no inverse. If the determinant does 
not vanish, the matrix is nonsingular and has an inverse. A Gramian
matrix is symmetric, and all its principal minors are equal to, or greater
than, zero. Most variance-covariance matrices are Gramian. Exceptions
indicate that at least one of the variables is completely a function of one
or more of the other variables in the matrix.

Certain other principles of determinants are sometimes useful. For 
example, the determinant of a matrix is equal to the determinant of its 
transpose. If A is the matrix, then 


$$|A| = |A^T|$$


Also, if A and B are square matrices of the same order, the determinant 
of their product equals the product of their determinants. That is, 


|A| |B| = |AB| 
Some matrix theorems involve the “adjugate” or “adjoint” matrix, 
formed by replacing each element of the square matrix, A, by its cofactor, 
and then writing the transpose of the result as adj(A). A roundabout way 
of finding an inverse is by the formula 


$$A^{-1} = \frac{1}{|A|}\,\operatorname{adj}(A) \qquad (16.21)$$

Adj(A) exists for any square matrix even if the determinant, |A|, is zero.
It is apparent from Eq. 16.21, however, that if |A| is zero, $A^{-1}$ does not
exist.
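
As a small illustration of Eq. 16.21, the following Python sketch builds the adjugate from cofactors and divides by the determinant. The function names are hypothetical, and the matrix is the three-predictor correlation matrix inverted in the next section; the approach is shown for clarity and is not an efficient method for large matrices.

```python
def minor(m, row, col):
    """Matrix left after deleting one row and one column."""
    return [[v for j, v in enumerate(r) if j != col]
            for i, r in enumerate(m) if i != row]


def det(m):
    """Determinant by cofactor expansion down the first column."""
    if not m:
        return 1.0
    return sum((-1) ** r * m[r][0] * det(minor(m, r, 0)) for r in range(len(m)))


def adjugate(m):
    """Replace each element by its cofactor, then transpose."""
    n = len(m)
    cofactors = [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)]
                 for i in range(n)]
    return [list(column) for column in zip(*cofactors)]


def inverse_by_adjugate(m):
    """Eq. 16.21: inverse = adj(A) / |A|, defined only when |A| is not zero."""
    d = det(m)
    if d == 0:
        raise ValueError("singular matrix: the determinant is zero")
    return [[v / d for v in row] for row in adjugate(m)]


predictors = [[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.7],
              [0.5, 0.7, 1.0]]
for row in inverse_by_adjugate(predictors):
    print([round(v, 5) for v in row])
```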

FINDING THE INVERSE OF A MATRIX


Equation 16.5a, which defines the inverse, also provides a method of
finding it. Since A is a square nonsingular matrix for which $A^{-1}$ exists,
$AA^{-1} = I$.

Both A and I can be premultiplied by a series of appropriate matrices 
of the same order, successively modifying A until it becomes the identity 
matrix I, while exactly the same operations on I result in the inverse. 

If the premultiplier is the identity matrix except that $a_{ii}$, the diagonal
element in the ith row and ith column, is a instead of 1, then the elements
in the ith row of A and the ith row of I are multiplied by a. If the pre-
multiplier is the identity matrix except that $a_{ij}$, the element in the ith row
and jth column, is a instead of 0, then the elements in row j of the matrix
operated on are multiplied by a and added to the elements in row i. These
are, of course, exactly the same operations described earlier for use in
evaluating a determinant, and several operations may be accomplished
simultaneously.

In Example 16.3 are shown numerical operations required by this 
method in finding an inverse. In this case it is the inverse of the matrix of 
the three predictor variables of Example 16.1. In five steps, A on the left 
side of the equation is transformed into I, the identity matrix, while on 
the right-hand side of the equation, I is in parallel steps transformed into 
the inverse, $A^{-1}$.


EXAMPLE 16.3


COMPUTATION OF THE INVERSE: THE USE OF OPERATIONS IN
R1, R2, AND R3, THE ROWS OF A AND I, THUS REDUCING THE EQUATION
($AA^{-1} = I$) TO THE IDENTITY ($IA^{-1} = A^{-1}$)


Operation in         Yield in
Preceding Matrix     New Matrix     Left-hand matrix                 Right-hand matrix

(original equation)                 1.000   .600   .500               1.00000    .00000    .00000
                                     .600  1.000   .700  · A^{-1} =    .00000   1.00000    .00000
                                     .500   .700  1.000                .00000    .00000   1.00000

R1              →    R1             1.000   .600   .500               1.00000    .00000    .00000
R2 − .6R1       →    R2              .000   .640   .400  · A^{-1} =   −.60000   1.00000    .00000
R3 − .5R1       →    R3              .000   .400   .750               −.50000    .00000   1.00000

R1              →    R1             1.000   .600   .500               1.00000    .00000    .00000
R2              →    R2              .000   .640   .400  · A^{-1} =   −.60000   1.00000    .00000
R3 − .625R2     →    R3              .000   .000   .500               −.12500   −.62500   1.00000

R1 − R3         →    R1             1.000   .600   .000               1.12500    .62500  −1.00000
R2 − .8R3       →    R2              .000   .640   .000  · A^{-1} =   −.50000   1.50000   −.80000
R3              →    R3              .000   .000   .500               −.12500   −.62500   1.00000

R1 − .9375R2    →    R1             1.000   .000   .000               1.59375   −.78125   −.25000
R2              →    R2              .000   .640   .000  · A^{-1} =   −.50000   1.50000   −.80000
R3              →    R3              .000   .000   .500               −.12500   −.62500   1.00000

R1              →    R1             1.000   .000   .000               1.59375   −.78125   −.25000
1.5625R2        →    R2              .000  1.000   .000  · A^{-1} =   −.78125   2.34375  −1.25000
2R3             →    R3              .000   .000  1.000               −.25000  −1.25000   2.00000

By the usual rules of matrix multiplication it can be determined that the
inverse has actually been found.

$$A \cdot A^{-1} = I$$

$$\begin{bmatrix} 1.000 & .600 & .500 \\ .600 & 1.000 & .700 \\ .500 & .700 & 1.000 \end{bmatrix} \cdot \begin{bmatrix} 1.59375 & -.78125 & -.25000 \\ -.78125 & 2.34375 & -1.25000 \\ -.25000 & -1.25000 & 2.00000 \end{bmatrix} = \begin{bmatrix} 1.000 & .000 & .000 \\ .000 & 1.000 & .000 \\ .000 & .000 & 1.000 \end{bmatrix}$$


It is also of interest to apply Eq. 16.11b, ($B = R^{-1}C_v$), to finding B, the
vector of betas, from $R^{-1}$, the inverse of the matrix of predictors, and $C_v$,
the vector of validity coefficients as used in Example 16.1. Then

$$B = \begin{bmatrix} 1.59375 & -.78125 & -.25000 \\ -.78125 & 2.34375 & -1.25000 \\ -.25000 & -1.25000 & 2.00000 \end{bmatrix} \cdot \begin{bmatrix} .300 \\ .500 \\ .600 \end{bmatrix} = \begin{bmatrix} -.0625 \\ .1875 \\ .5000 \end{bmatrix} \qquad (16.22)$$


As would be expected, the betas are identical with those obtained in 
Example 16.1 by pivotal condensation and a back solution. 
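
The row operations of Example 16.3 and the multiplication of Eq. 16.22 can be sketched in Python as follows. This is an illustrative sketch with hypothetical names; it scales each pivot row to unity as it goes and clears the whole column at once (a Gauss-Jordan ordering, assumed here for brevity), rather than following the five steps of the example in the same sequence, but the same elementary row operations are applied in parallel to A and to I.

```python
def invert(matrix):
    """Invert a square matrix by row operations on the pair [A | I]."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Assumption: choose the largest available pivot for numerical safety.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]        # make the pivot unity
        for r in range(n):
            if r != col:
                factor = aug[r][col]                    # multiple to subtract
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                     # right half is the inverse


predictors = [[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.7],
              [0.5, 0.7, 1.0]]
validities = [0.3, 0.5, 0.6]

inverse = invert(predictors)
betas = [sum(r * c for r, c in zip(row, validities)) for row in inverse]
print([round(b, 4) for b in betas])  # [-0.0625, 0.1875, 0.5]
```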


INVERSE OF THE COMPLETE MATRIX IN CORRELATIONAL ANALYSIS 


In Example 16.3 only the inverse of the matrix of three predictors is 
obtained. By the same method, but with additional steps, the inverse of 
the complete matrix of the four variables of Example 16.1 can be found. 
With identification of the variables in parentheses, it is 


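$$R^{-1} = \begin{array}{c|rrrr} & (3) & (2) & (1) & (0) \\ \hline (3) & 1.6000 & -.8000 & -.3000 & .1000 \\ (2) & -.8000 & 2.4000 & -1.1000 & -.3000 \\ (1) & -.3000 & -1.1000 & 2.4000 & -.8000 \\ (0) & .1000 & -.3000 & -.8000 & 1.6000 \end{array} \qquad (16.23)$$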


Each diagonal element in the inverse of any matrix of correlation
coefficients has definite analytic meaning: It is the reciprocal of the partial
variance of the variable concerned, after variance predictable from all the
other variables has been removed. Thus, in the lower right-hand corner,
1.6000 is $1/V_{0.123}$. Accordingly, $V_{0.123}$ is 1/1.6000, or .6250, exactly as
found in Example 16.1, while $R_{0(123)}$ is $\sqrt{1 - .6250}$, or .61, also exactly
as found in Example 16.1 and in the determinantal solution.

The off-diagonal elements in rows or columns are functions of the
final-order betas, with the variable denoting the row or column as the
criterion. They are betas, reversed in algebraic sign, and divided by the
final partial variance of the criterion; that is, the criterion variance less
the portion predictable from all other variables in the matrix. For example,
in Eq. 16.23, the analytic meaning of the elements in the fourth row of the
inverse can be taken to be $-\beta_{03.12}/V_{0.123}$, $-\beta_{02.13}/V_{0.123}$, $-\beta_{01.23}/V_{0.123}$,
and $1/V_{0.123}$. Accordingly, to obtain the betas, each off-diagonal element
is reversed in sign and divided by the diagonal element. Thus,
−(.1000/1.6000) = −.0625; −(−.3000/1.6000) = .1875; and −(−.8000/1.6000) =
.5000, exactly the same betas as found in Example 16.1 and again in Eq. 16.22.
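
The reading of the inverse just described can be expressed as a few lines of Python. This is a minimal sketch with hypothetical variable names; the numbers are the elements of the inverse of the complete four-variable matrix, with rows and columns in the order 3, 2, 1, 0.

```python
import math

# Inverse of the complete correlation matrix; rows and columns are
# variables 3, 2, 1, 0 in that order.
inverse = [[ 1.6, -0.8, -0.3,  0.1],
           [-0.8,  2.4, -1.1, -0.3],
           [-0.3, -1.1,  2.4, -0.8],
           [ 0.1, -0.3, -0.8,  1.6]]
labels = [3, 2, 1, 0]

crit = labels.index(0)                       # position of the criterion
diagonal = inverse[crit][crit]
partial_variance = 1 / diagonal              # V_0.123
multiple_r = math.sqrt(1 - partial_variance)

# Betas: off-diagonal elements reversed in sign, divided by the diagonal.
betas = {labels[j]: -inverse[crit][j] / diagonal
         for j in range(len(labels)) if j != crit}

print(round(partial_variance, 4), round(multiple_r, 2))
print({k: round(v, 4) for k, v in betas.items()})
```

Choosing a different row as `crit` gives the partial variance and betas with that variable treated as the criterion.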


FINDING THE INVERSE OF AN ASYMMETRICAL MATRIX 


The method demonstrated in Example 16.3 of finding the inverse is 
applicable to any nonsingular square matrix, whether symmetrical or not. 
The routine is straightforward. In the first cycle, multipliers of row 1 are 
chosen so that all off-diagonal elements in column 1 are reduced to zero 
when multiples of row 1 are subtracted from the other rows. In the second 
cycle, multipliers of row 2 are chosen so that all elements in column 2 
below the diagonal can similarly be reduced to zero, and so on, until the 
matrix becomes triangular. Then, in a back routine, multiples of the bottom 
row are found to reduce the off-diagonal elements in the last column to 
zero, again by subtraction. The routine is continued with successive 
columns. Finally, multipliers are applied to the rows to make all diagonal 
elements unity. As operations reduce matrix A to the identity matrix, 
corresponding operations in the matrix, which starts as I, result in the 
inverse, $A^{-1}$.

There are numerous alternate procedures for finding the inverse, most 
of them variations of the method illustrated in Example 16.3. Of particular 
interest in finding the inverse of a correlation matrix is a procedure (1) 
that preserves the analytic meaning of all elements at every stage. Many 
of the large electronic computers have routines for finding the inverses of 
large matrices of various descriptions, especially since matrix inversion is 
of interest to many fields other than statistics. 


AN EXERCISE IN READING MATRIX EQUATIONS 


If an inversion method is applicable directly only to symmetric matrices, 
it nevertheless can be used to find the inverse of a nonsymmetrical, non- 
singular matrix. The derivation of this procedure provides an exercise in 
reading matrix equations. 

Let A be any nonsymmetric matrix of order n for which the inverse,
$A^{-1}$, exists. When a nonsymmetric matrix is multiplied by its transpose,
the result is a symmetric matrix. Let S be the symmetric matrix of order n
found by post-multiplying A by its transpose $A^T$. Accordingly, $AA^T = S$.

When a matrix is moved from one side of an equation to another, it
appears on the other side as an inverse; and when two or more matrices
are moved, they appear as inverses in reverse order. Thus,

$$S^{-1} = (A^T)^{-1}A^{-1} \qquad (16.24)$$

Both sides of Eq. 16.24 can now be post-multiplied by A. It will be
remembered that $A^{-1}A$ equals the identity matrix and I as a multiplier
can be dropped out of an equation. Accordingly,

$$S^{-1}A = (A^T)^{-1}A^{-1}A = (A^T)^{-1}I = (A^T)^{-1} \qquad (16.25)$$

Since the inverse of a transpose is equal to the transpose of the inverse,
Eq. 16.25 can be rewritten as

$$S^{-1}A = (A^{-1})^T \qquad (16.25a)$$

A matrix equation remains an equality if the transpose is taken of both
sides. This yields the formula for the inverse of a nonsymmetrical matrix
in terms of a symmetrical matrix formed from it:

$$(S^{-1}A)^T = A^{-1} \qquad (16.26)$$


This compact formula implies that one way of finding the inverse of a 
nonsymmetric matrix is by the following four steps: 


1. Post-multiply the matrix by its transpose. This yields a symmetric 
matrix. 

2. Find the inverse of this symmetric matrix. 

3. Post-multiply this inverse by the original matrix. 

4. Transpose the result. This is the inverse of the original nonsymmetric
matrix.
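
The four steps can be traced with a small numerical case. The sketch below is illustrative only: the 2 × 2 matrix A and all function names are hypothetical, exact fractions are used to avoid rounding, and the general-purpose `invert` routine merely stands in for whatever symmetric-matrix inversion method is actually at hand.

```python
from fractions import Fraction


def transpose(m):
    return [list(column) for column in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*b)]
            for row in a]


def invert(m):
    """Stand-in inversion routine (row operations on [M | I])."""
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


A = [[2, 1],
     [5, 3]]                           # hypothetical nonsymmetric matrix

S = matmul(A, transpose(A))            # step 1: S = A A^T (symmetric)
S_inv = invert(S)                      # step 2: inverse of S
product = matmul(S_inv, A)             # step 3: S^{-1} A
A_inv = transpose(product)             # step 4: (S^{-1} A)^T = A^{-1}

print([[int(v) for v in row] for row in A_inv])  # [[3, -1], [-5, 2]]
```

Multiplying A by the result returns the identity matrix, which confirms Eq. 16.26 for this case.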


SUMMARY 


In psychological statistics the concept of a matrix, in which rows and
columns have assigned meanings, is applicable to rosters of scores and to
complete tables of variances and covariances and of correlation coefficients.
Operations that can be described in terms of succinct matrix
notation include the computation of variances, covariances, and correlations
from the score matrix.

By regarding a square matrix as a determinant, new mathematical
properties emerge. For example, a determinant has a unique value, which,
for variances and covariances, cannot be negative. Formulas for multiple
and partial correlation can be written in terms of total and minor determinants.

A matrix equation may summarize an indefinite number of simultaneous
linear equations, such as the "normal" equations in which the beta
coefficients are the unknowns. Matrix routines that involve finding the
inverse of a matrix of correlations (with unity in each of the diagonal
cells) can be used to find final-order betas. In advanced statistical work,
equations in terms of matrices have great importance.

EXERCISES


1. Given X, the matrix of the deviation scores of ten individuals on three vari-
ables, find the product matrix by the operation $X^TX$. Convert the matrix of
values of $\sum x_i^2$ and $\sum x_i x_j$ into a variance-covariance matrix by multiplying
each term by the scalar 1/N.

Returning to matrix X, divide each column by its standard deviation, thus
forming a matrix of z scores, designated as Z.

Post-multiply the transpose of Z (that is, $Z^T$) by Z, and divide each element
of the product matrix by N, thus forming R, the matrix of correlation co-
efficients with unity in the diagonal. (This is also a variance-covariance matrix
of z scores.)

Compare the R matrix so formed with an R matrix computed directly from
the matrix of values of $\sum x_i^2$ and $\sum x_i x_j$, using the conventional formula,

$$r_{ij} = \frac{\sum x_i x_j}{\sqrt{(\sum x_i^2)(\sum x_j^2)}}$$


=J сй —8 
+4 +6 —3 
+2 0 +6 
= 9 —3 --9 
X= | = 6 —9 =5 
= 1 +4 +8 
+3 --3 —6 
+ Ж% РА. 9 
- 5 0 +3 
11 +2 +8 


2. Given R as follows, evaluate the determinant by finding the sum of six terms, 
and also evaluate the determinant by pivotal condensation. 


1.00 .48 18 
R= 48 1.00 .53 
48. .53 1.00 


3. Find $R^{-1}$, the inverse of R in Exercise 2.


4. The following intercorrelations of two motion picture tests of attention and
two stanine scores were reported by Gibson (2):


                                  2      3      4
1 Flexibility of attention       .40    .35    .23
2 Integration of attention              .36    .25
3 Bombardier stanine                           .71
4 Navigator stanine


Evaluate the total determinant. Then, by evaluating the four principal minors,
find $R_{1(234)}$, $R_{2(134)}$, $R_{3(124)}$, and $R_{4(123)}$.


5. Find the inverse of the matrix in Exercise 4 and from the diagonal terms
find $R_{1(234)}$, $R_{2(134)}$, $R_{3(124)}$, and $R_{4(123)}$.

6. Stern and Gordon (3) report the following matrix of intercorrelations of three
predictor variables and a criterion, recruit final achievement, for 511 naval
recruits:


                                  2      3      4
1 General classification test    .28    .54    .58
2 Clerical test                         .36    .18
3 Oral direction test                          .36
4 Recruit final achievement


Find the inverse of the matrix composed of the z-score variances and co-
variances of variables 1, 2, and 3. From the inverse so formed ($R^{-1}$) and the
vector of the three validities ($C_v$) find the final-order betas for predicting
recruit final achievement by matrix multiplication (Eq. 16.11b) as follows:

$$R^{-1}C_v = B$$

7. Show that the following cannot be considered a correlation matrix: 
        1      2      3
1     1.00    .30    .80
2      .30   1.00    .90
3      .80    .90   1.00


8. For each of the following pairs of concepts give an important way in which
they are alike and an important way in which they are different. 


(a) Matrix and determinant. 

(b) Inverse and transpose. 

(c) Scalar matrix and identity matrix. 

(d) Matrix multiplication and addition of matrices. 

17

INTRODUCTION TO FACTOR ANALYSIS


Factor analysis is a branch of statistics concerned with the isolation and
identification of a limited number of hypothetical variables underlying a
group of observed variables. The factors so discovered are hypothetical in
the sense that, while scores or values for specific cases can sometimes be
estimated, they can never be computed precisely. As are the variables
entering into coefficients corrected for attenuation, factors are known by
their correlations and their variances.

When a correlation coefficient is corrected for attenuation, the
variability resulting from random error can be considered to be partialed
out from both variables. Somewhat similarly, each variable in a factor
analysis is treated as though it included neither random error nor specific
variance, the latter being reliable variance not shared with other variables
in the matrix being analyzed. In effect, all "unique" variance (that is,
variance not common with other variables) is partialed out of each variable
before the analysis is begun. This is done by reducing the observed
variances, the 1.00's in the diagonal of the matrix of r's, to "communalities."


TWO PHASES OF A FACTOR ANALYSIS


Typically a factor analysis has two phases:

1. The separation of the common variance, or communality, of each of
the variables into a minimum number of uncorrelated portions repre-
senting the underlying factors; and

2. The identification of the factors as meaningful.


In the first phase, a small number of theoretical and unobservable 
factors are inferred, by which the matrix of intercorrelations of a group of 
observed variables can be reproduced as closely as possible. In the second 
phase, alternate sets of loadings, or correlations of the observed variables
with the hypothetical factors, are computed until a set is obtained that 
appears to be meaningful in understanding the original variables. In this 
process, alternative solutions are compared as to their scientific accepta- 
bility, and sometimes factors are modified so that they become correlated. 

Factor analysis starts with a matrix of correlation coefficients, which 
may be considered as covariances in z form. What can be discovered in a 
single study is limited by, and must be interpreted in terms of, the observed 
variables that happen to constitute the matrix. However, factors identified 
by well-known variables in one matrix may be identified again by the same 
“marker” variables in other matrices. 


CASE OF A SINGLE COMMON FACTOR 


Factor analysis originated in the work of Charles Spearman (10), who 
is chiefly responsible for the basic concepts and formulas useful when a 
matrix of correlations can be explained as arising from a single general 
factor. 

In z form, let any variable, $z_i$, be made up of two uncorrelated portions
(also in z form with unit variance): $g_a$, which appears also in other variables;
and $u_i$, which is unique to $z_i$ and which includes both specific and error
variance. By definition, $u_i$ is uncorrelated not only with $g_a$, but also with
the unique portion of variables other than variable i. Let $a_i$ be the weight
of $g_a$ in variable i and $w_i$ be the weight of $u_i$. The basic equation is then:

$$z_i = a_i g_a + w_i u_i \qquad (17.1)$$

Squaring both sides of Eq. 17.1, summing the variables while leaving the
constants outside the summation signs, and dividing by N yield

$$\frac{\sum z_i^2}{N} = a_i^2\,\frac{\sum g_a^2}{N} + 2a_i w_i\,\frac{\sum g_a u_i}{N} + w_i^2\,\frac{\sum u_i^2}{N} \qquad (17.2)$$

Since $z_i$, $g_a$, and $u_i$ all have variances of 1.00, and since $\sum g_a u_i / N$ is .00,
Eq. 17.2 becomes

$$a_i^2 + w_i^2 = 1 \qquad (17.3)$$

which shows that the variance of variable i, taken as 1.00, can be divided
into two portions: $a_i^2$, the common variance, and $w_i^2$, the unique variance.
It is to be noted that $a_i^2$ is the communality, the more general notation
for which is $h_i^2$, always used when there is more than one common factor.

Multiplying both sides of Eq. 17.1 by $g_a$, summing, and dividing by
N yield

$$\frac{\sum z_i g_a}{N} = a_i\,\frac{\sum g_a^2}{N} + w_i\,\frac{\sum g_a u_i}{N} \qquad (17.4)$$

Since both $z_i$ and $g_a$ are in z form with unit variances, their covariance
on the left side of Eq. 17.4 is their correlation. The expression $\sum g_a^2 / N$ as a
variance of unity can be omitted. Since the unique portion of the variable
$u_i$ and the general factor $g_a$ are uncorrelated by definition, their covariance
is zero and the final term of Eq. 17.4 drops out. Accordingly, it can be seen
that $r_{ig_a}$, the correlation between any variable, denoted as variable i, and
the single factor explaining the intercorrelations of a group of variables,
is $a_i$, which was originally defined as the weight of $g_a$ in variable i.

Consider a second variable j, defined in manner parallel to variable i,
as follows:

$$z_j = a_j g_a + w_j u_j \qquad (17.5)$$

Note that variable j is made up of two uncorrelated portions, $g_a$ and $u_j$,
with weights of $a_j$ and $w_j$, respectively. The common factor $g_a$ is the same
as found in variable i, but $u_j$ is uncorrelated with any other unique factor.
Accordingly, when we multiply Eq. 17.1 by Eq. 17.5, sum all terms, and
divide by N, expressions involving either $u_i$ or $u_j$ (or both) drop out as
covariances of uncorrelated variables and the result is

$$\frac{\sum z_i z_j}{N} = a_i a_j\,\frac{\sum g_a^2}{N} \qquad (17.6)$$

Since any correlation coefficient is the covariance between two sets of
z scores with unit variance, the expression on the left of Eq. 17.6 is equal to
$r_{ij}$. On the right-hand side, $\sum g_a^2 / N$ is the variance of the general factor
$g_a$ and, by definition, is 1.00. Accordingly, Eq. 17.6 becomes

$$r_{ij} = a_i a_j \qquad (17.7)$$


a fundamental equation stating that if a single factor is responsible for the 
intercorrelations of a group of variables, each correlation is precisely equal 
to the product of a pair of factor weights. The equation provides a basis for 
finding the unknown factor weights from an observed matrix of corre- 
lations. 

To find the numerical value of $a_i^2$, the communality of variable i, three
correlations among the three variables $z_i$, $z_j$, and $z_k$ are needed, all
expressible as the products of factor weights as follows:

$$\frac{r_{ij}\,r_{ik}}{r_{jk}} = \frac{a_i a_j\,a_i a_k}{a_j a_k} = a_i^2 \qquad (17.8)$$

Taking the square root of both sides of Formula 17.8 yields

$$r_{ig_a} = a_i = \sqrt{\frac{r_{ij}\,r_{ik}}{r_{jk}}} \qquad (17.9)$$

which is the correlation between variable i and the factor central to variables
i, j, and k. Accordingly, $a_i$ is the factor loading of variable i in the posited
general factor $g_a$.

Neither Formula 17.8 nor Formula 17.9 is universally true. If Formula 17.8 
yields values that are negative or greater than 1.00, then it is impossible to 
account for the intercorrelations of variables i, j, and k as resulting 
from a single general factor. Furthermore, even when values from 
Formula 17.8 are between .00 and 1.00 it may be necessary to posit loadings 
in more than one factor in order to make the analysis of variables i, j, and 
k consistent with the analysis of other variables in the matrix. 
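
Formulas 17.8 and 17.9 lend themselves to a brief numerical check. In the Python sketch below, which is illustrative only, the function names and the three loadings (.7, .5, and .6) are hypothetical; the correlations are generated from those loadings by Eq. 17.7 and the loadings are then recovered. The positive square root is taken, which is an assumption about the direction in which the factor is scored.

```python
import math


def communality(r_ij, r_ik, r_jk):
    """Formula 17.8: a_i squared from three intercorrelations."""
    return r_ij * r_ik / r_jk


def loading(r_ij, r_ik, r_jk):
    """Formula 17.9, refusing values a single factor cannot produce."""
    h2 = communality(r_ij, r_ik, r_jk)
    if not 0.0 <= h2 <= 1.0:
        raise ValueError("no single general factor fits these correlations")
    return math.sqrt(h2)


# Hypothetical loadings .7, .5, .6 give r12 = .35, r13 = .42, r23 = .30.
r12, r13, r23 = 0.35, 0.42, 0.30

print(round(loading(r12, r13, r23), 2))  # variable 1: 0.7
print(round(loading(r12, r23, r13), 2))  # variable 2: 0.5
print(round(loading(r13, r23, r12), 2))  # variable 3: 0.6
```

A set such as r12 = .90, r13 = .90, r23 = .50 would raise the error, since the implied communality of variable 1 is 1.62.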


AN ARTIFICIAL MATRIX 


An artificial matrix, generated by a single common factor, is shown as 
Table 17.1 on page 457. Points about this matrix are: 

1. Factor loadings, or correlations between the observed variables and 
the general factor g,, are shown in the first column and top row, and are 
outside the matrix proper. Although these loadings were used to develop 
the matrix, in the real situation it would be necessary to compute them 
from the intercorrelations. 

2. Communalities, denoted as $h^2$, are in the main diagonal. In the
real situation, these, too, would be found from the intercorrelations. In
the single factor case, each communality is the square of the corresponding
factor loading, so that $h^2 = a^2$.

3. The matrix proper is symmetrical about the main diagonal. Each
correlation (or covariance in z form) appears twice. Above the diagonal
in the algebraic representation, the correlations are identified by the factor
weights by which they were generated; below the diagonal they are denoted
simply as r.

4. In terms of matrix algebra, the matrix is described as of rank one. 
In factor analysis, this means that it can be generated by a single set of 
factor loadings, or conversely, when a single factor is partialed out, the 
“residuals,” or resulting partial covariances, are all zero. The communali- 
ties in the diagonal are also reduced to zero, but in the real situation they 
are unknown, except as they are computed so as to be consistent with the r's.

5. Since the variance of the general factor g, is 1.00, the top row of 
factor weights can be regarded as a vector of covariances, and the column 
of loadings can be regarded as a vector of betas. Accordingly, by the
formula for partialing, given in Chapter 8 (E' = E — BC), each element of a 
new matrix of coefficients of the six original variables, with the general 
factor partialed out, would be .00. Thus it can be seen that the objective 
in factor analysis of reducing observed covariances to zero with the fewest 
possible posited variables is attained. In this matrix it is sufficient to 
partial out a single common factor. 

6. The general factor could be partialed out just as easily if variances of 
1.00 instead of communalities were written in the diagonal. However, the 
use of observed z-score variances would increase the rank of the matrix to 
n, the number of variables. Each partial variance remaining in the diagonal 
of the residual matrix would then represent the unique proportion of the 
original variable. 

7. Whenever any matrix is of rank one, all second-order minors, when 
evaluated, equal zero. As noted in the preceding chapter, a second-order 
minor is a determinant of two rows and two columns abstracted from a 
larger determinant by eliminating the other columns and rows. Format and 
evaluation follow. 

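$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

Rows and columns retained to form the minor need not be adjacent,
and four different variables may be concerned. Thus, for variables 1, 5, 4,
and 6, we have

$$\begin{vmatrix} r_{14} & r_{16} \\ r_{54} & r_{56} \end{vmatrix} = \begin{vmatrix} -.24 & .30 \\ -.28 & .35 \end{vmatrix} = (-.24)(.35) - (-.28)(.30) = -.084 + .084 = .00$$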


In terms of products of factor loadings of variables i, j, k, and l, this
becomes

$$\begin{vmatrix} r_{ik} & r_{jk} \\ r_{il} & r_{jl} \end{vmatrix} = \begin{vmatrix} a_i a_k & a_j a_k \\ a_i a_l & a_j a_l \end{vmatrix} = a_i a_k a_j a_l - a_j a_k a_i a_l = .00$$

This is Spearman’s tetrad criterion, which he used in identifying variables
with a single common factor.
8. An important application of the evaluation of a second-order minor
comes in the case of three variables, when one of the elements is a
communality. Thus,

$$\begin{vmatrix} h_i^2 & r_{ij} \\ r_{ik} & r_{jk} \end{vmatrix} = h_i^2 r_{jk} - r_{ij} r_{ik}$$

If the matrix is of rank one, the expression must equal zero. In the real
case, the correlations would be known and the communality unknown.
Solving for $h_i^2$ yields

$$h_i^2 = \frac{r_{ij} r_{ik}}{r_{jk}} \tag{17.8a}$$

which is an alternate statement of Formula 17.8. The formula is appropriate
when there is evidence that a matrix, or a part of a matrix, is of rank one.

9. By closer examination of Table 17.1, it may be noted that all columns 
of coefficients correlate perfectly. The coefficients in any vector i are 
proportional to the coefficients in any other vector j. This is necessarily 
true if the matrix can be explained on the basis of one general factor. 
The converse is not necessarily true. 

When all columns correlate perfectly and when all tetrad differences of
the type $(r_{ik} r_{jl} - r_{il} r_{jk})$ equal zero, it is likely, but not certain, that the
matrix, excluding the diagonal terms, is of rank one. In dealing with a
matrix of proportional correlations, a value greater than 1.00 may be 
required to make a communality proportional to the covariances in other 
columns. Since a communality greater than 1.00 is impossible with corre- 
lational data, the rank of the matrix must be greater than one. In that case 
more than a single factor is required to explain the intercorrelations. 
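
Points 7 and 8 can be checked numerically. The following is a minimal sketch in Python, not part of the original text; it assumes the six loadings of the numerical part of Table 17.1, and the function names `tetrad` and `communality` are illustrative only.

```python
from itertools import combinations

# Loadings a_1 ... a_6 on the single common factor (numerical part of Table 17.1).
loadings = [0.60, 0.20, -0.50, -0.40, 0.70, 0.50]
n = len(loadings)

# Rank-one matrix with communalities in the diagonal: r_ij = a_i * a_j.
R = [[loadings[i] * loadings[j] for j in range(n)] for i in range(n)]


def tetrad(matrix, i, j, k, l):
    """Second-order minor formed from rows i, j and columns k, l."""
    return matrix[i][k] * matrix[j][l] - matrix[i][l] * matrix[j][k]


def communality(matrix, i, j, k):
    """Formula 17.8a: h_i^2 = r_ij * r_ik / r_jk (off-diagonal r's only)."""
    return matrix[i][j] * matrix[i][k] / matrix[j][k]


# Minor worked in the text: variables 1 and 5 against 4 and 6 (0-based indices).
example_minor = tetrad(R, 0, 4, 3, 5)

# Largest absolute tetrad over every minor built from four distinct variables.
largest_tetrad = max(
    abs(tetrad(R, i, j, k, l))
    for i, j in combinations(range(n), 2)
    for k, l in combinations(range(n), 2)
    if not {i, j} & {k, l}
)

# Communality of variable 1 recovered from its correlations with variables 2 and 3.
h2_variable_1 = communality(R, 0, 1, 2)

print(f"largest |tetrad| = {largest_tetrad:.1e}")
print(f"h_1^2 = {h2_variable_1:.2f}")
```

With real correlations the tetrads would only approximate zero, and a recovered communality greater than 1.00 would signal that more than one factor is needed.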


TABLE 17.1. ARTIFICIAL MATRIX GENERATED BY ONE COMMON FACTOR

[Each correlation (below the diagonal) is precisely equal to the factor product (above
the diagonal).]

Algebraic Representation

VARIABLE   g_a     1       2       3       4       5       6

g_a        V_ga    a1      a2      a3      a4      a5      a6
1          a1      h1²     a1a2    a1a3    a1a4    a1a5    a1a6
2          a2      r12     h2²     a2a3    a2a4    a2a5    a2a6
3          a3      r13     r23     h3²     a3a4    a3a5    a3a6
4          a4      r14     r24     r34     h4²     a4a5    a4a6
5          a5      r15     r25     r35     r45     h5²     a5a6
6          a6      r16     r26     r36     r46     r56     h6²

Numerical Representation

VARIABLE   g_a     1       2       3       4       5       6

g_a        1.00    .60     .20    -.50    -.40     .70     .50
1           .60    .36     .12    -.30    -.24     .42     .30
2           .20    .12     .04    -.10    -.08     .14     .10
3          -.50   -.30    -.10     .25     .20    -.35    -.25
4          -.40   -.24    -.08     .20     .16    -.28    -.20
5           .70    .42     .14    -.35    -.28     .49     .35
6           .50    .30     .10    -.25    -.20     .35     .25

For the data in Table 17.1 it can be seen that by matrix multiplication
described in Chapter 16, the unrotated and rotated solutions reproduce
the matrix equally well:

$$R = \begin{bmatrix} .60 \\ .20 \\ -.50 \\ -.40 \\ .70 \\ .50 \end{bmatrix} \begin{bmatrix} .60 & .20 & -.50 & -.40 & .70 & .50 \end{bmatrix} = \begin{bmatrix} .48 & -.36 \\ .16 & -.12 \\ -.40 & .30 \\ -.32 & .24 \\ .56 & -.42 \\ .40 & -.30 \end{bmatrix} \begin{bmatrix} .48 & .16 & -.40 & -.32 & .56 & .40 \\ -.36 & -.12 & .30 & .24 & -.42 & -.30 \end{bmatrix}$$

SOME LIMITATIONS ON THE FACTOR ANALYTIC METHOD 


In most forms of statistical analysis the problem exists of generalizing 
from the observed sample to the population it represents. This problem 
appears in factor analysis, but it is compounded by two sources of in- 
determinacy in the study of the sample itself. 

In computing means, standard deviations, differences, correlations, 
and other descriptive statistics for samples, answers are unequivocal. In 
factor analysis, however, what is analyzed for each variable is the common 
variance or communality, generally placed in the diagonal of the corre-
lation matrix. A lower limit is known to be the square of multiple R;
that is, the proportion of the variance of the variable predictable from all 
other variables in the matrix and for which a precise numerical value can 
be found. The upper limit of the communality is the reliability coefficient, 
or proportion of nonerror variance, but this coefficient, like the com- 
munality itself, is necessarily an estimate rather than a precise descriptive 
statistic. 

Especially with small matrices, results of a factor analysis will vary
with how the diagonal cells are filled, and only in special cases are com-
munalities precisely determined.
The second source of indeterminacy is that, when the rank of a matrix
is greater than one, there is an unlimited number of F matrices, which 
mathematically are just as adequate in reproducing the original correlations 
as the F matrix first derived. By rotation of the axes of projection, any 
multiple-factor solution can be modified to any number of equivalent 
solutions. 

Accordingly, factor analysis as applied to psychological data is an art 
that is somewhat dependent on the skill and intuition of the investigator. 
It is not a series of precisely defined procedures yielding a rigorously 
objective result. 


BASIC EQUATION OF FACTOR ANALYSIS 


Let Ro be an observed correlation matrix in which, by some method or
other, communalities have been inserted in the main diagonal. Let R be
an approximation of Ro obtained by multiplying the matrix F by its
transpose. Then the basic equation of factor analysis is

$$F F' = R \tag{17.10}$$

in which R ≈ Ro.
Certain characteristics of R may be noted: 


1. The rank of R is the same as the rank of F. If F has m columns, then
m operations of pivotal condensation in R will reduce all remaining
elements to .00.

2. Since the determinant of R is .00, it has no inverse. 

3. Each diagonal element, as a communality, must be the variance 
common with the other variables in the matrix. 

4. With real data, the rank of Ro will, in general, be greater than that
of R. We have no reason to believe, however, that any set of communalities
fitting Ro is, in truth, unique. Consequently, we are at liberty to alter the
factor loadings in F in any way that will produce off-diagonal elements of
R approximating the off-diagonal elements of Ro as closely as possible.
In the interest of parsimony, a low-rank solution is considered better than
a high-rank solution. If our factors are designated as g_a, g_b, ..., g_m, with
weights for variable i of a_i, b_i, ..., m_i, then by extension of Eqs. 17.1 through
17.7,

$$z_i = a_i g_a + b_i g_b + \cdots + m_i g_m + u_i U_i \tag{17.1a}$$

and

$$a_i^2 + b_i^2 + \cdots + m_i^2 + u_i^2 = 1 \tag{17.3a}$$

in which

$$a_i^2 + b_i^2 + \cdots + m_i^2 = h_i^2$$

and

$$r_{ij} = a_i a_j + b_i b_j + \cdots + m_i m_j \tag{17.7a}$$




That is to say, the communality of any variable is the sum of the squares 
of the factor weights and any correlation is the sum of products of the 
factor weights. 

These facts are summarized more succinctly in Eq. 17.10. 
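
As an illustration of Eqs. 17.3a, 17.7a, and 17.10, the following sketch multiplies a small factor matrix by its transpose. The loadings are hypothetical values chosen for this example, and the variable names are assumptions rather than notation from the text.

```python
# Hypothetical loadings of four variables on two uncorrelated common factors.
F = [
    [0.8, 0.3],
    [0.7, 0.2],
    [0.2, 0.6],
    [0.1, 0.7],
]

# Eq. 17.10: R = F F'. Each element is a sum of products of factor weights.
R = [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in F] for row_i in F]

# Diagonal of R: communalities h_i^2, the sums of squared weights.
communalities = [R[i][i] for i in range(len(F))]

# Eq. 17.3a: what is left of the unit variance is the unique proportion u_i^2.
uniqueness = [1.0 - h2 for h2 in communalities]

for i, (h2, u2) in enumerate(zip(communalities, uniqueness), start=1):
    print(f"variable {i}: h^2 = {h2:.2f}, u^2 = {u2:.2f}")
print(f"r_12 = {R[0][1]:.2f}, r_34 = {R[2][3]:.2f}")
```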


METHODS OF FACTOR ANALYSIS 


Detailed descriptions of the numerical methods that have been used to 
develop an F matrix are beyond the scope of this text. No single method 
has yet emerged as universally acceptable as the best. The advent of 
electronic computers has made possible the use of factor extraction and 
rotation methods that would be prohibitive with desk calculators. 

Whatever method of developing F is used, there are three bases for 
evaluating the factor matrix as an analysis of the original matrix of 
correlations. 

1. How closely does R approximate Ro? Ro − R is the “residual”
matrix consisting, except for the diagonal elements, of partial covariances 
between the original variables, with the variability associated with the 
several posited factors partialed out. The aim of factor analysis is to make 
these residual coefficients so small that they can be taken as chance 
deviations from zero. 

2. The solution should be of low rank; that is, the number of factors, as
represented by the number of columns in F, should be small. Obviously, this
requirement, coupled with the requirement of small residuals, contributes
to the indeterminacy of factoring. As the rank of F goes up, the approxi-
mation of R to Ro becomes better and better. A low-rank solution may
require compromises in the size of the residuals, while a good fit of R to Ro
may require a goodly number of factors.

3. The final requirement, originally proposed by Thurstone, is for 
simple structure. This means that the F matrix, as extracted or as subse- 
quently rotated on m axes, must have high loadings in “marker” variables 
and low positive or zero loadings in other variables. If all elements in F 
are zero or positive, the factor solution is said to have positive manifold. 


FITTING A SINGLE FACTOR 
An observed matrix of correlations can generally be explained as arising
from a single, common factor if the elements in different vectors are more
or less proportional; that is, if all pairs of vectors have correlations
approaching 1.00.¹ A formula developed by Spearman yields the factor
weights:

$$r_{ig} = \sqrt{\frac{\left(\sum r_{ij}\right)^2 - \sum r_{ij}^2}{2\left(\sum\sum r_{jk} - \sum r_{ij}\right)}} \qquad (i \neq j \neq k) \tag{17.11}$$

in which $\sum\sum r_{jk}$ is the sum of all the correlations in the matrix, each counted
once, and $\sum r_{ij}$ is the sum of the vector of r's involving i. The square of
each $r_{ig}$ would be the communality, but in this special case of factor
analysis, communalities are not needed.

¹ Mathematically, this is the same as the requirement that the tetrads be .00.

The procedure is to: 

1. Find the vector of factor weights; and 
2. Remove the variability of this general factor from the matrix by 
partialing it out. 

If the resultant partial covariances approximate zeroes, then a good 
fit has been secured. Actually, a decision as to whether or not the residual 
covariances can be attributed to chance requires that they be converted 
into partial r's and that their distribution be compared with the standard 
error of a partial r of .00, for the appropriate number of degrees of freedom. 
Exercise 2 at the end of this chapter requires a simple analysis of this sort. 
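
The two steps can be sketched as follows. This is a minimal illustration, not the author's worked solution: it uses the artificial correlations of Table 17.1 rather than the data of Exercise 2, and because Formula 17.11 gives only the size of each weight, the sign is taken from the correlation with variable 1, which is an assumption of this sketch.

```python
import math

true_loadings = [0.60, 0.20, -0.50, -0.40, 0.70, 0.50]
n = len(true_loadings)
# Off-diagonal correlations only; the diagonal is set to 0.0 and never read.
R = [[true_loadings[i] * true_loadings[j] if i != j else 0.0 for j in range(n)]
     for i in range(n)]


def spearman_loadings(matrix):
    """Single-factor weights by Formula 17.11, without communalities."""
    size = len(matrix)
    # Sum of all correlations in the matrix, each counted once.
    total = sum(matrix[j][k] for j in range(size) for k in range(j + 1, size))
    weights = []
    for i in range(size):
        row = [matrix[i][j] for j in range(size) if j != i]
        row_sum = sum(row)
        row_sum_sq = sum(r * r for r in row)
        squared = (row_sum ** 2 - row_sum_sq) / (2.0 * (total - row_sum))
        # Assumption: the sign is copied from the correlation with variable 1.
        sign = -1.0 if i != 0 and matrix[0][i] < 0 else 1.0
        weights.append(sign * math.sqrt(squared))
    return weights


def residuals(matrix, weights):
    """Partial covariances r_ij - a_i * a_j for all pairs i < j."""
    size = len(matrix)
    return [matrix[i][j] - weights[i] * weights[j]
            for i in range(size) for j in range(i + 1, size)]


weights = spearman_loadings(R)
largest_residual = max(abs(e) for e in residuals(R, weights))
print([round(w, 2) for w in weights])
print(f"largest residual = {largest_residual:.1e}")
```

With observed correlations the quantity under the square root can be negative or the denominator can vanish; the sketch does not guard against either case.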


DIAGONAL OR SQUARE ROOT METHOD 
When the rank of a matrix is m and the communalities of m variables,
loaded in m factors, are known precisely, the diagonal or square root
method is applicable.

Factor loadings are derived as follows:

1. The loading of the “pivot” variable, the common variance of which is
to be partialed out of the matrix, is the square root of its communality.
If one or more factors have already been extracted, the loading is the
square root of the residual communality.
2. For other variables the loading is the covariance with the pivot, divided
by the loading of the pivot. Thus, for variable j, when variable i is
being partialed out, the loading is $c_{ij} / h_i$.

A process akin to the pivotal condensation used in multiple and partial
correlation reduces all elements in the vector of the pivot variable to zero,
and reduces elements in other vectors to partial variances and covariances
of higher order. The process is continued until all remaining elements
approach zero.

The method works well with artificial matrices in which communalities 
are known. With real data it is limited by the fact that the communality 
estimates generally available are not sufficiently precise to make the 
method useful. Occasionally, however, when good communality estimates 
are available, the method is feasible. The method can be used with the
data of Exercise 3 at the end of this chapter.
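
The extraction can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the author's computing routine: the generating loadings are hypothetical, the communalities are taken as known exactly, and the pivot variables (the first and the last) are chosen arbitrarily.

```python
import math


def diagonal_method(R, pivots):
    """Extract one factor per pivot variable; return loadings and final residuals."""
    size = len(R)
    resid = [row[:] for row in R]
    F = [[] for _ in range(size)]
    for p in pivots:
        # Loading of the pivot: square root of its (residual) communality.
        h = math.sqrt(resid[p][p])
        # Loading of every variable: covariance with the pivot over the pivot loading.
        column = [resid[j][p] / h for j in range(size)]
        for j in range(size):
            F[j].append(column[j])
        # Partial the factor out of every variance and covariance.
        for j in range(size):
            for k in range(size):
                resid[j][k] -= column[j] * column[k]
    return F, resid


# Hypothetical rank-two matrix with exact communalities in the diagonal.
G = [[0.7, 0.3], [0.6, 0.4], [0.5, 0.5], [0.3, 0.7], [0.2, 0.8]]
R = [[sum(a * b for a, b in zip(gi, gj)) for gj in G] for gi in G]

F, resid = diagonal_method(R, pivots=[0, 4])
largest_residual = max(abs(x) for row in resid for x in row)
reproduced = [[sum(a * b for a, b in zip(fi, fj)) for fj in F] for fi in F]
largest_misfit = max(
    abs(x - y) for row_a, row_b in zip(reproduced, R) for x, y in zip(row_a, row_b)
)
print(f"largest residual = {largest_residual:.1e}")
```

The derived loadings differ from the generating ones (the first pivot has a zero loading on the second factor), yet they reproduce the matrix equally well, which is the rotational indeterminacy noted earlier.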

COMPUTATIONAL METHODS IN MULTIPLE-FACTOR ANALYSIS


Practical procedures for finding an F matrix that can be taken as the roots
of an observed matrix of correlations and for rotating the obtained factor
solution to a meaningful configuration, either on orthogonal or oblique
axes, are beyond the scope of this text. Of the methods available, several 
will be mentioned briefly. Surveys of existing methods are presented by 
Harman (5) and Fruchter (4). 

The bifactor method was developed by Holzinger to extend Spearman’s
single-factor formulation to the case in which there is a large general 
factor, followed by "group" factors in limited clusters of variables. 
Characteristically, the obtained loadings are as shown in Table 17.2. The 
use of this method is described by Harman (5). 


TABLE 17.2. TYPICAL FACTOR SOLUTIONS

                                         CENTROID AND PRINCIPAL
VARIABLE     HOLZINGER BIFACTOR          AXES (UNROTATED)
                  FACTORS                     FACTORS

             I    II   III  ...          I    II    III   ...
1            +    +    0    ... 0        +    +/-   +/-
2            +    +    0    ... 0        +    +/-   +/-
3            +    +    0    ... 0        +    +/-   +/-
4            +    0    +    ... 0        +    +/-   +/-
5            +    0    +    ... 0        +    +/-   +/-
6            +    0    +    ... 0        +    +/-   +/-
7            +    0    0    ... +        +    +/-   +/-
8            +    0    0    ... +        +    +/-   +/-
9            +    0    0    ... +        +    +/-   +/-

+ = Positive loading     - = Negative loading     0 = Zero loading
+/- = Positive or negative loading (individual signs not legible in the source)

One of the most popular of the multiple-factor methods has been the 
centroid solution, developed by Thurstone (11). It involves filling the 
diagonal cells by making estimates of the communalities; a simple formula 
for factor weights; extraction of a first factor; “reflection” of vectors until
a second factor can be extracted; continuation of the process until the 
variance of the matrix has been exhausted; iteration of the communalities 
by repeating the extraction procedure; and rotation of the F matrix until 
the solution is meaningful. Characteristically, with psychological variables 
with positive intercorrelations, the first factor as extracted has positive 
loadings in all variables, and subsequent factors as extracted are “bipolar,”
with approximately half of the loadings positive and half negative. This 
characteristic pattern is shown in Table 17.2. Thurstone insists, however, 
that factors as extracted have little or no meaning, and only by rotation 
can an alternate solution be obtained that is in any way useful. 

A more sophisticated alternate to the centroid method is the principal 
axes solution, the forerunner of which was envisaged by Pearson (9) 
and which was developed as a factor method by Hotelling (6) and Kelley 
(7). In effect, a set of axes is fitted to the swarm of points representing
variables in n-dimensional space in such a manner that maximum variance
is extracted. The arithmetical work required with any substantial number
of variables is such that the method is feasible only with high-speed com- 
puters. Characteristically, the original solution is much like that of the 
centroid and becomes interpretable only through rotation.

A mathematically sophisticated solution to factor analysis is the
maximum likelihood method, developed by Lawley (8). This also is
difficult computationally. For a given rank of the factor matrix, it develops
a matrix of factor loadings such that the sum of the squares of the differ-
ences between original correlations and correlations estimated from the F
matrix is as small as possible.


METHODS OF ROTATING THE F MATRIX 

In the original development of multiple factor analysis, graphic methods 
were employed to rotate the several axes of projections until a meaningful 
factor configuration was obtained. Beginning with the work of Carroll (1)
in 1953, several methods, feasible with computers, have been developed
by which axes can be rotated analytically, with resultant parsimonious
description of the factors. The analytic methods yield either uncorrelated
factors (projected on orthogonal axes) or correlated factors (projected on
oblique axes).

READING A REPORT OF A FACTOR ANALYSIS 


The psychologist is often confronted with a report of a factor analysis,
part of which is almost always a table of factor loadings such as the
one shown in Example 17.1.


EXAMPLE 17.1 


READING THE REPORT OF A FACTOR ANALYSIS 


Purposes of a Factor Analysis. A factor analysis such as the one reported by
Fleishman and Ellison (3) aims to reduce the number of variables needed to
reproduce the off-diagonal elements of a correlation matrix from n, the number
of observed variables, to n', the number of factors (n > n'). If, by matrix
multiplication and with negligible error, the n(n − 1)/2 r's can be reproduced
from the n' factors, the matrix may be considered factored. Most factor analysts
use three other criteria of adequate factorization:


1. n' must be as small as possible and appreciably smaller than n;

2. Each factor must be defined by high correlations with some observed
variables, low or zero correlations with others; and

3. To the maximum extent possible, the factor matrix should have positive
manifold; that is, factor loadings should be positive or zero.


A factor analysis is regarded not only as a means of understanding the com-
position of observed variables, but also often as a step in making decisions about
new variables that need to be developed, sometimes as purer measures of factors
appearing in the analysis.

Steps in the Factor Analysis. In the study summarized in Table 17.3, one of
a series concerned with the isolation and definition of factors in manipulative
and other psychomotor tests, 760 airmen were given a battery yielding 9 printed
test scores and 13 scores from apparatus tests. After the computation of the
231 intercorrelations, Thurstone’s centroid method was used to extract 7 factors.
The largest residual² was .05, indicating that the fit of FF' to Ro is good. As
would be anticipated (see Table 17.2), the loadings of the unrotated factors
beyond the first were about equally divided into those with positive and those
with negative signs, as follows:


FACTOR               I    II   III   IV    V    VI   VII
Positive loadings   22     9    12   13   12    11     9
Negative loadings    0    13    10    9   10    11    13


Results. Rotations of orthogonal axes resulted in the solution presented as
Table 17.3, in which factors are identified at the tops of the columns. Important
points are:


1. There are only three large negative loadings in F, and in each case another
score from the same test has a large positive loading in the same factor. Hence
F has positive manifold.

2. Of the 154 factor loadings, 36 have an absolute value of .30 or more,
which the authors have taken as the lower limit of significance.

3. The communalities in the last column are the sums of the squares of the
factor loadings of the variable. They may be taken as the proportion of the
variance explained by the common factors. The complement of h² is u², the
proportion of unique variance, including both specific reliable variance and error.

The original matrix of r’s can be reconstituted, except for discrepancies or
residuals and within rounding error, by multiplying F by its transpose. Here
are a few correlations and residuals as reported in the original article, to-
gether with corresponding r’s as reconstructed from Table 17.3.


         ORIGINAL r    RESIDUAL    RECONSTRUCTED r

r12         .83           .04           .80
r13         .57           .00           .58
r14         .57           .00           .57
r23         .57           .00           .58
r24         .57           .00           .57
r34         .74          -.01           .76


² The term residual as used in factor analytical literature refers not to a residual
variable nor to a value of such a variable, but rather to the partial covariance between
two variables after partialing out the factor(s).

In reading a factor study, cognizance may be taken of the following
points:


1. To what degree does the F matrix reproduce the original matrix of
correlations? This is never apparent directly from the F matrix, but is
usually indicated somewhere in the text of the article. Often it is instructive,
when axes are orthogonal, to attempt to approximate some of the original
r's by matrix multiplication. Obviously, the closeness of fit of FF' to Ro is
the first criterion of successful factorization.

2. How many factors have been found? That is to say, what is the rank
of F? Here, with empirical data, there is always some sort of a com-
promise. If the rank of R approaches n, the number of variables, then
FF' can equal Ro exactly. In general, however, the aim of factor analysis
is to reduce the number of dimensions and still permit FF' to approximate
Ro.

3. As projected on rotated orthogonal or oblique axes, do the loadings
of F make sense? What observed variables help in the interpretation of F,
and what loadings in the several factors help in understanding the ob-
served variables? Typically, factor analysis involves the use of marker
variables to define factors, and uses factors to understand variables hitherto
less well understood. In Table 17.3, 22 conventional psychological tests
have heavy loadings in 7 factors, in terms of which various training criteria
can be interpreted.


SUMMARY 


Modern factor analysis stems from the work of Spearman, who found 
evidence for a single factor underlying mental tests. When it was realized 
that his two-factor theory (that is, one general factor plus specific factors in 
each variable) was inadequate to explain the matrices of correlations 
actually obtained in psychological investigations, Thurstone and others 
developed multiple-factor methods. These yield two or more columns of 
factor weights by which the original r’s can be reproduced more or less 
exactly. 

Since an indefinite number of alternate multiple-factor solutions always 
exist, the principle of simple structure is generally used as the basis of 
choosing the most suitable configuration of the rotated factor matrix. 

Despite continued development of methods, it must be emphasized that 
factor analysis is far from being an exact set of procedures for data 
reduction and for drawing inferences about the parameters of an unknown 
population. In the hands of skillful research workers, however, it appears 
useful in the interpretation of a large mass of observations and in suggesting
new avenues of exploration.


EXERCISES


1. By matrix multiplication generate a single factor matrix with known com-
munalities from the following F-matrix:


[F-matrix: the entries are not legible in the source.]

Consider the resultant R matrix: 


(a) Are the entries in each vector proportional to the entries in any other 
parallel vector? 

(b) Test any off-diagonal 2 x 2 minor. Does it equal zero? 

(c) Test a 2 x 2 minor that includes one or two communalities. Does it equal 
zero? 

(d) Find an $h_i^2$ from Formula 17.8a, $h_i^2 = r_{ij} r_{ik} / r_{jk}$. Is the correct value
recovered?


2. The following are intercorrelations of five reading skills as reported by 
Davis (2). Find single-factor loadings by Formula 17.11, partial out the 
factor, and inspect the residuals to see if they are sufficiently small so that the 
assumption of a single factor in the original matrix is plausible. 


VARIABLE                                  2      3      4      5
1  Word meanings                         .41    .32    .68    .68
2  Following organization                       .34    .42    .41
3  Answering questions                                 .55    .55
4  Drawing inferences                                         .68
5  Inferring writer's characteristics


3. The following hypothetical two-factor matrix is from Thurstone (11): 


VARIABLE      2      3      4      5
1            .50    [?]    .30    .21
2                   .58    .44    .34
3                          .54    .57
4                                 .62


The generating F matrix, with communalities, is shown below, at the left.
At the right is a set of factor weights derived by the square root method.

In general, distribution-free tests require relatively simple arithmetical
operations (counting or ranking) and the use of a table based on the 
distribution of the statistic, which may vary with N, or the N’s of subclasses, 
or with the degrees of freedom. 

When original data are in the form of frequencies or ranks, statistical 
tests necessarily fall in the distribution-free classification. When interval or 
ratio data are converted to the frequencies or ranks required by the 
typical distribution-free tests, there are two consequences: 


1. Some of the information gathered as original observations is simplified 
and thus disappears; and 

2. When the null hypothesis is in fact false, there is less probability of 
rejecting it. 

For identical data, then, these short-cut and approximate methods of 
making statistical tests are less powerful than conventional procedures. 
Even when the assumptions needed for parametric tests are not fully 
justified, t and F are often preferable to a distribution-free method.
However, χ² is one of the most versatile devices in the armamentarium of
hypothesis testing, while various other distribution-free techniques are 
finding increasingly numerous applications with psychological data. An 
attractive feature is that they often permit valid inferences from small 
numbers of cases. Characteristically, the statistic itself is constructed 
intuitively, after which its distribution is worked out mathematically 
by procedures such as those already demonstrated in Chapter 12. When a 
large number of individual distributions are involved, the published tables 
generally give only selected significance points, such as those exceeded by 
.05 and .01 of the chance distribution of the statistic.


RANKING METHODS 


SPEARMAN'S ρ AS A DISTRIBUTION-FREE STATISTIC


A statistic that does not assume normal distributions for the two variables
in the parent population, but rather forces rectangular distributions on
all sets of data, is ρ, Spearman’s coefficient of rank correlation, the formula
for which was given in Formula 4.5 as

$$\rho = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$

Before using ρ to test the null hypothesis of no association between the
two variables in the population represented by the sample, its distribution
must be known. As implied by the display in Table 18.1, there is a separate
distribution of ρ for each value of N, the number of pairs of observations.
If two variables, X and Y, each consist of ranks from 1 to N, and X is
arranged in a fixed order, there are factorial N (that is, N!) ways of
arranging the values of Y to match those of X. The N! values of ρ computed
from the N! ways of arranging Y constitute the distribution of ρ. Distribu-
tions of ρ from N = 2 to N = 5 are given in Table 18.1.


TABLE 18.1. DISTRIBUTIONS OF ρ FROM N = 2 TO N = 5


            N = 2         N = 3         N = 4         N = 5
   ρ       f     p       f     p       f     p       f     p
  1.00     1   .500      1   .167      1   .042      1   .008
   .90                                               4   .033
   .80                                 3   .125      3   .025
   .70                                               6   .050
   .60                                 1   .042      7   .058
   .50                   2   .333                    6   .050
   .40                                 4   .167      4   .033
   .30                                              10   .083
   .20                                 2   .083      6   .050
   .10                                              10   .083
   .00                                 2   .083      6   .050
  -.10                                              10   .083
  -.20                                 2   .083      6   .050
  -.30                                              10   .083
  -.40                                 4   .167      4   .033
  -.50                   2   .333                    6   .050
  -.60                                 1   .042      7   .058
  -.70                                               6   .050
  -.80                                 3   .125      3   .025
  -.90                                               4   .033
 -1.00     1   .500      1   .167      1   .042      1   .008
           N! = 2        N! = 6        N! = 24       N! = 120


Each distribution consists of N! frequencies, and the probability p for each
possible value of ρ is f divided by N!. The p are not cumulated, as would be
necessary in developing entries for a table of significance points. It can be seen,
however, that:

1. If N = 4, a ρ of 1.00 is significant at the .05 level (more precisely, the
.042 level).
2. If N = 5, a ρ of 1.00 is significant at the .01 level (more precisely, the
.008 level).
3. If N = 5, a ρ of .90 is significant at the .05 level (more precisely, the
5/120, or .042 level).
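
The frequencies of Table 18.1 can be regenerated by listing every arrangement of Y. The following sketch is an added illustration, assuming untied ranks; the function name `rho_distribution` is not from the text. Exact fractions are used so that equal values of ρ are grouped without rounding error.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def rho_distribution(n):
    """Exact chance distribution of Spearman's rho for n pairs of untied ranks."""
    x = range(1, n + 1)
    counts = Counter()
    for y in permutations(x):
        sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))
        rho = 1 - Fraction(6 * sum_d2, n * (n * n - 1))
        counts[rho] += 1
    return counts


dist5 = rho_distribution(5)

# Cumulated upper tail for N = 5: arrangements giving rho of .90 or more.
upper_tail_90 = sum(f for rho, f in dist5.items() if rho >= Fraction(9, 10))
total = sum(dist5.values())

for rho in sorted(dist5, reverse=True):
    print(f"{float(rho):6.2f}  f = {dist5[rho]:3d}  p = {dist5[rho] / total:.3f}")
print(f"P(rho >= .90) = {upper_tail_90}/{total}")
```

Because the number of arrangements is N!, complete enumeration of this kind is practical only for small N.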


In testing an observed ρ, one may use the same .05 and .01 levels of
significance as are often used with t and F. Table 18.2 gives (for values of
N from 4 to 30) values of ρ needed for significance at the .05 and .01 levels
(one-tailed tests). A ρ computed for a given N is merely compared with the
size of the ρ needed at a predetermined level of significance. It is seen that
results of acceptable significance can sometimes be obtained for as few as
four or five cases.


TABLE 18.2. CRITICAL VALUES OF ρ AT THE .01 AND .05 LEVELS OF SIGNIFICANCE(a)
(One-Tailed Test)


LEVEL OF                                    N
SIGNIFICANCE     4      5      6      7      8      9      10     12     14
    .01                1.000   .943   .893   .833   .783   .746   .712   .645
    .05        1.000   .900   .829   .714   .643   .600   .564   .506   .456

LEVEL OF                                    N
SIGNIFICANCE     16     18     20     22     24     26     28     30
    .01         .601   .564   .534   .508   .485   .465   .448   .432
    .05         .425   .399   .377   .359   .343   .329   .317   .306


(a) From Olds, E. G., 1938, “Distributions of sums of squares of rank differences for small numbers of in-
dividuals,” Ann. Math. Statist., 9, 133-148; and from Olds, E. G., 1949, “The 5 percent significance levels
for sums of squares of rank differences and a correction,” Ann. Math. Statist., 20, 117-118, with the kind
permission of the publisher.


THE RUN TEST¹

Occasionally an investigator is interested in whether a sequence of two
types of events can be considered random. No parametric test is available
for this purpose. The run test can be used to evaluate whether an observed
number of unbroken “runs” within a total series is within chance expec-
tancy. It is applicable both to ranks and to sequences in time.

Let the n events in one category be X1, X2, ..., Xn and the m events in the
other category be Y1, Y2, ..., Ym. The total number of events is n + m = N.
The first step is to consolidate the two categories in a single series and to
count d (the number of runs), each of which, in the artificial sample below,
is enclosed in parentheses:


(X1 X2)(Y1)(X3 X4 X5)(Y2 Y3)(X6)(Y4)(X7 X8 X9 X10)(Y5 Y6)
(+  + )(- )(+  +  + )(-  - )(+ )(- )(+  +  +  +  )(-  - )

(X11 X12)(Y7 Y8 Y9 Y10 Y11)(X13 X14)(Y12)
(+   +  )(-  -  -  -   -  )(+   +  )(-  )


X’s and Y’s may have numeric values, allowing them to be placed in
order, or they may be ranks within the total series, or they may be the
presence or absence of a characteristic (as indicated by the + and - signs
in the second line).

Here, d = 12, n = 14, m = 12, and N = 26.

¹ Known also as the Wald-Wolfowitz run test.

Strictly, when n and m are less than 20, deviation from randomness is
evaluated by special tables.² However, when n and m are 10 or more, the
procedure described below and used routinely when n and m are 20 or
more is reasonably accurate.

A decision has to be made as to which of the following conditions will 
be the basis for rejecting the null hypothesis: 


1. Fewer runs than anticipated by chance (one-tailed test); 
2. More runs than anticipated by chance (one-tailed test); or 
3. Either more or fewer runs than anticipated by chance (two-tailed test). 


Let us decide to reject the null hypothesis at the 5 percent level of
significance if the runs are fewer in number than would be anticipated by
chance. Thus, the strings or runs of like variates would be longer than
expected on the basis of the null hypothesis. For large values of n and m,
the chance distribution of d is normal, with the following mean and stan-
dard deviation:

$$\mu_d = \frac{2mn}{N} + 1$$

$$\sigma_d = \sqrt{\frac{2mn(2mn - N)}{N^2(N - 1)}}$$

To reject the null hypothesis under the conditions chosen, $(d - \mu_d)/\sigma_d$
must be −1.645 or less (that is, farther from the mean, with the same sign
and greater absolute value). Here, $\mu_d$ = 13.92 and $\sigma_d$ = 2.48. However, since
d is 12, $(d - \mu_d)/\sigma_d$ is only −0.77; therefore there is no reason in this case
to reject the null hypothesis.
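
The computation can be sketched as follows for the artificial sample of 26 signs. This is a minimal illustration, not a substitute for the special tables required with small n and m; it takes the mean as 2mn/N + 1, and the function names are assumptions of this sketch.

```python
import math


def count_runs(sequence):
    """Number of unbroken runs of like symbols in a non-empty sequence."""
    runs = 1
    for previous, current in zip(sequence, sequence[1:]):
        if current != previous:
            runs += 1
    return runs


def run_test(sequence, first="+"):
    """Return d, its chance mean and standard deviation, and the normal deviate."""
    n = sum(1 for s in sequence if s == first)
    m = len(sequence) - n
    total = n + m
    d = count_runs(sequence)
    mean = 2 * m * n / total + 1
    sd = math.sqrt(2 * m * n * (2 * m * n - total) / (total ** 2 * (total - 1)))
    return d, mean, sd, (d - mean) / sd


# The 12 runs of the artificial sample, written as one string of signs.
signs = "++-+++--+-++++--++-----++-"
d, mean, sd, z = run_test(signs)
print(f"d = {d}, mean = {mean:.2f}, sd = {sd:.2f}, z = {z:.2f}")

# One-tailed test for too few runs at the 5 percent level.
reject = z <= -1.645
```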


OTHER TESTS BASED ON RANK ORDER 


Some of the distribution-free tests based on ranks parallel in intent 
conventional parametric tests for differences between means. However, 
means as such are not computed (but may be reflected in sums of ranks), 
and information as to variability comes from numbers of cases ranked. 

The matched-pairs, signed-ranks test evaluates differences between two 
correlated variables. It was developed by Wilcoxon (13) who presents a 
table for use when N, the number of pairs, is 25 or less. 

2 Tables for small values of n and m were developed by Swed and Eisenhart (10) and 
are reproduced by various authors, including Siegel (7) and Tate and Clelland (11). 

Briefly, differences between the N pairs of values are computed and 
ranked by absolute size, each rank retaining the algebraic sign of its 
difference. The totals of positive and negative ranks are found separately, 
and the smaller sum of ranks, T, is evaluated either by the table or by 
computation. When N is 25 or more, T is considered to be normally distributed 
with mean of N(N + 1)/4 and standard deviation of $\sqrt{N(N + 1)(2N + 1)/24}$. 
Accordingly, $(T - \mu_T)/\sigma_T$ can be readily evaluated from a table of the 
area of the normal curve, such as Table A (Appendix). 
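
As a rough sketch of the computation just described, the following Python finds T and its normal deviate. The function names are hypothetical. Two simplifying assumptions are made that the text does not spell out: pairs with a zero difference are dropped, and no two absolute differences are tied.

```python
import math


def signed_rank_t(x, y):
    """Return (T, N): the smaller rank sum and the number of nonzero differences."""
    # Assumption: pairs with a zero difference are dropped.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Rank 1 goes to the smallest absolute difference (no ties assumed).
    ordered = sorted(diffs, key=abs)
    positive = sum(rank for rank, diff in enumerate(ordered, start=1) if diff > 0)
    negative = sum(rank for rank, diff in enumerate(ordered, start=1) if diff < 0)
    return min(positive, negative), len(diffs)


def signed_rank_z(t, n_pairs):
    """Normal deviate for T, intended for use when N is 25 or more."""
    mean = n_pairs * (n_pairs + 1) / 4
    sd = math.sqrt(n_pairs * (n_pairs + 1) * (2 * n_pairs + 1) / 24)
    return (t - mean) / sd


# Hypothetical paired scores; the differences are 1, -2, 3, 4.
t_small, n_small = signed_rank_t([10, 12, 15, 20], [9, 14, 12, 16])
```

With so few pairs, T would be referred to Wilcoxon's table rather than to the normal curve; `signed_rank_z` applies only to the large-sample case.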

The U test of Mann and Whitney (5) is applicable to differences between 
unmatched groups of any size. It is an extension of the T test for the 
special case of two unmatched groups of identical size, developed by 
Wilcoxon (13). The procedure involves ranking in a single series from low 
to high the n observations in one group and the m observations in the other. 
Two sums of ranks are then obtained: $T_n$ for the group of n cases and $T_m$ 
for the group of m cases. Then U is the smaller of the following quantities: 

$$nm + \frac{n(n + 1)}{2} - T_n \quad\text{and}\quad nm + \frac{m(m + 1)}{2} - T_m$$


U is evaluated by tables³ developed by Mann and Whitney. Again, as 
n and m increase in size (that is, beyond 8), U approaches normality, with 
mean of nm/2 and standard deviation of $\sqrt{nm(N + 1)/12}$, in which 
n + m = N, the total number of observations. The rarity of an obtained U 
can then be evaluated with the use of a table of the area of the normal curve, 
and the null hypothesis may be rejected or accepted at a predetermined 
level of confidence. 
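
The U computation can be illustrated with a short Python sketch. The names `mann_whitney_u` and `u_normal_z` and the data are hypothetical, and the sketch assumes there are no tied observations, since the treatment of ties is not covered here.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, T_n, T_m) for two unmatched groups with no tied values."""
    n, m = len(x), len(y)
    # Rank all n + m observations in a single series from low to high.
    pooled = sorted([(value, "x") for value in x] + [(value, "y") for value in y])
    t_n = sum(rank for rank, (_, group) in enumerate(pooled, start=1) if group == "x")
    t_m = sum(rank for rank, (_, group) in enumerate(pooled, start=1) if group == "y")
    u_from_n = n * m + n * (n + 1) / 2 - t_n
    u_from_m = n * m + m * (m + 1) / 2 - t_m
    return min(u_from_n, u_from_m), t_n, t_m


def u_normal_z(u, n, m):
    """Normal deviate for U, for use when n and m are beyond about 8."""
    mean = n * m / 2
    sd = math.sqrt(n * m * (n + m + 1) / 12)
    return (u - mean) / sd


# Hypothetical measurements for two small groups.
u, t_n, t_m = mann_whitney_u([1.1, 2.3, 4.0], [3.2, 5.5, 6.1, 7.7])
```

The two candidate quantities always sum to nm, which gives a quick check on the arithmetic.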

In addition to the tests that parallel the critical ratio and t, methods of 
analysis of variance with ranked data have been developed. Kruskal and 
Wallis (4) have written on the one-way case; Friedman (2), on the two-way. 
These methods are useful when original observations must be in the form 
of ranks. 


TESTS BASED ON ALGEBRAIC SIGNS OF DIFFERENCES 


THE SIGN TEST 


The sign test, one of the most widely used distribution-free techniques, 
employs the binomial distribution (described in Chapter 12) to test 
whether the median difference between pairs of observations is zero. If the 
original observations are in the form of interval data, less information is 
used in the sign test than in the matched-pairs, signed-ranks test, which 
also tests differences between paired observations. In the sign test, only 
information as to the direction of the differences within pairs is used. 
Two sets of matched observations are compared, pair by pair, and the 
sign of each difference is noted as plus or minus. (If both members of a 
pair are equal, it is the custom to drop that pair from further consideration.) 


3 These tables are available in various texts in statistics, as well as in the original 
reference. 




If the median difference in the population were zero, half of the signs, 
within sampling error, would be expected to be plus and half would be 
minus. 

The chance distribution of plus and minus signs for the null hypothesis 
of a median difference of zero is given by the binomial expansion $(p + q)^N$, 
in which N is the total number of pairs yielding differences and in which 
p = q = .50. 

There are four ways of evaluating a situation in which X of N pairs 
yield one sign and (N — X) the other: 

1. By Formula 12.1, the exact probabilities of N, (N − 1), ..., X like 
signs can be found and summed. If this sum is equal to or less than the 
proportion corresponding to the predetermined significance level, the 
null hypothesis may be rejected at that level. Thus, if X = 2 and N = 7, the 
sum of the probabilities of X = 0, 1, 2 is .227, to which would be added 
the probabilities of X = 5, 6, 7 (also .227) if a two-tailed test is appropriate. 

2. The binomial proportions may be read directly from a table such as 
that published by the National Bureau of Standards (6). Less extensive 
tables are available in various texts, including Walker and Lev (12). 

3. As pointed out in Chapter 12, the binomial distribution approaches the 
normal curve more and more closely as p and q approach equality 
and as the exponent applicable to (p + q) increases. Here p = q = .50 and 
the exponent is N. Accordingly, by Formula 12.2, the mean is Np, or .5N; 
and by Formula 12.3, the standard deviation is $\sqrt{Npq}$, or $.5\sqrt{N}$. 

If N is large (say, 25), X can be converted to conventional z-score form 
and evaluated by Table A (Appendix). Thus, 

$$z = \frac{X - \mu}{\sigma} = \frac{X - .5N}{.5\sqrt{N}}$$

A simple correction for continuity, discussed by Siegel (7), improves the 
approximation. 
4. When N is 10 or more, the obtained distribution of plus and minus 
signs can be compared with a theoretical distribution of equal numbers of 
plus and minus signs, using χ² with 1 df, as in the diagram below: 


χ² APPLIED TO THE SIGN TEST 

              OBSERVED         THEORETICAL 
              FREQUENCIES      FREQUENCIES 

   +             X                N/2 
   −           N − X              N/2 
Preferably, Yates' correction for continuity (demonstrated in a somewhat 
different context in Example 3.7) is used. This involves subtracting .5 
from the greater f_o and adding .5 to the lesser f_o. 
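
Three of the four ways of evaluating the sign test can be sketched in Python as follows. The function names and the numerical cases are hypothetical. The continuity correction in `sign_test_z` is assumed to be the usual one of moving X half a unit toward the mean, and the one-tailed exact probability is taken for the less frequent sign.

```python
import math


def sign_test_exact(x, n, two_tailed=False):
    """Exact binomial probability of a split at least as extreme as x of n."""
    k = min(x, n - x)  # count of the less frequent sign
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_tailed else tail


def sign_test_z(x, n, continuity=True):
    """Absolute normal deviate, with mean .5N and standard deviation .5 * sqrt(N)."""
    deviation = abs(x - 0.5 * n)
    if continuity:
        deviation = max(deviation - 0.5, 0.0)
    return deviation / (0.5 * math.sqrt(n))


def sign_test_chi2(x, n, yates=True):
    """Chi-square with 1 df against equal theoretical frequencies of N/2."""
    expected = n / 2
    total = 0.0
    for observed in (x, n - x):
        deviation = abs(observed - expected)
        if yates:
            deviation = max(deviation - 0.5, 0.0)
        total += deviation ** 2 / expected
    return total


# Hypothetical cases: 1 minus sign among 10 pairs, and 8 among 30 pairs.
p_one_tail = sign_test_exact(1, 10)
z = sign_test_z(8, 30)
chi2 = sign_test_chi2(8, 30)
```

With both corrections applied, the χ² value equals the square of z, so the normal-curve and χ² routes lead to the same decision for a two-tailed test.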


MEDIAN TEST 


The median test resembles the U test in that it compares two groups that 
may be of unequal size and in which there is no matching of cases. However, 
only algebraic signs of differences between each case and the median of the 
combined groups are considered. 

The n + m cases of $X_1, X_2, \ldots, X_n$ and $Y_1, Y_2, \ldots, Y_m$ are ranked in a 
single series to find their median. Then the frequencies of X and of Y 
above the median constitute f_x+ and f_y+ in the following diagram: 


                  −         + 

   Group X      f_x−      f_x+       n 
   Group Y      f_y−      f_y+       m 
                f_−       f_+        N 


For small numbers of cases, Fisher (1) provides an exact test of significance. 
If n and m are sufficiently large so that each f_e is 5 or more, the 
2 x 2 table can be evaluated for independence by the χ² test with 1 df. 
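
A minimal Python sketch of the median test follows. The function name and the data are hypothetical. Two assumptions are made: a value exactly equal to the combined median is counted as not above it, and χ² is computed by the standard shortcut formula for a 2 x 2 table without a continuity correction.

```python
from statistics import median


def median_test(x, y):
    """Return ((x_minus, x_plus, y_minus, y_plus), chi-square with 1 df)."""
    combined_median = median(list(x) + list(y))
    # Assumption: values equal to the median are counted as "minus".
    x_plus = sum(1 for value in x if value > combined_median)
    y_plus = sum(1 for value in y if value > combined_median)
    x_minus = len(x) - x_plus
    y_minus = len(y) - y_plus
    a, b, c, d = x_minus, x_plus, y_minus, y_plus
    total = a + b + c + d
    # Shortcut formula for a 2 x 2 table: N(ad - bc)^2 / product of the marginals.
    chi2 = total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return (a, b, c, d), chi2


# Hypothetical scores for two unmatched groups of ten cases each.
group_x = [1, 2, 3, 4, 5, 6, 7, 11, 12, 13]
group_y = [8, 9, 10, 14, 15, 16, 17, 18, 19, 20]
table, chi2 = median_test(group_x, group_y)
significant_at_05 = chi2 >= 3.841  # Table C, 1 df, P = .05
```

In this made-up case every theoretical frequency is 5, so the χ² evaluation is admissible.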


OTHER DISTRIBUTION-FREE STATISTICS 


The coefficient of contingency C, considered in Chapter 2, and Kendall's τ 
and percentiles discussed in Chapter 4 are among the statistics generally 
classed as distribution-free when used as a basis for significance tests. 
Methods also exist for finding the confidence limits of percentiles; for 
testing randomization of values within a set of observations; and for testing 
the degree to which deviations from some posited value are greater than 
might be expected by chance. 

When data can be collected only in the form of frequencies or ranks, the 
investigator should consider the use of an appropriate distribution-free 
method. This discussion of a relatively small number of tests is merely an 
introduction to the subject. The presentation has been simplified and, in 
particular, methods of treating ties have been largely omitted. These are 
given in the references. 


SUMMARY 


Several of the statistics underlying distribution-free, or nonparametric, 
tests were presented in Chapters 2 and 4, while some of the logic applicable 
to them was given in Chapters 3, 10, and 11. 
In this chapter, comments on the distribution of ρ illustrate the three 
stages characterizing an inferential statistic applicable to categorical 
data or to ranks (or to interval data reduced to categories or ranks). 


1. A descriptive statistic is developed, often intuitively; 

2. Its distribution on the basis of a null hypothesis is developed; and 

3. In dealing with an appropriate sample of cases, the research worker 
decides on the level of significance that he will use for rejecting the null 
hypothesis. He then computes the statistic and evaluates its rarity on the 
basis of the known distribution. This paradigm, of course, follows that 
of conventional statistical tests. 


EXERCISES 


1. For the 24 oldest children in a village school, Spearman (8) reports ranks as 
follows: 


         INTELLECTUAL RANK               INTELLECTUAL RANK 
         OUT OF      IN                  OUT OF      IN 
  SEX    SCHOOL    SCHOOL         SEX    SCHOOL    SCHOOL 
   F        6         2            F        4         8 
   M       11        22            F        9        14 
   F       16         7            F       15        10 
   F        1         1            M       17        17 
   M        3         8            M       22         5 
   F       10         9            F       14        15 
   F        8        12            M       19        24 
   F        2         6            M       18        16 
   M        5        11            M       23        20 
   M       21        19            M       24        23 
   F       12         4            F        7        13 
   F       13        18            M       20        21 


(a) Test whether the association between the two sets of ranks is significant 
for the total group. 
(b) Rank males and females separately and test whether the association is 
significant in each of the two subgroups. 
2. Drawings by 23 grade school pupils were ranked in order from 1 to 23. 
When identities were revealed, it was found that drawings with the following 
ranks were by boys: 
2, 5, 6, 7, 9, 10, 15, 16, 17, 23 


By the U test, determine at the 5 percent level of significance whether boys 
and girls differ in the quality of their drawings. 

3. The sequence below represents 57 individuals observed walking past a certain 
point on a university campus during an 8-minute period. Males are represented 
as 1, females as 0. Apply the run test to the data to decide at the 
5 percent level of confidence whether the number of unbroken sequences 
differs from what might be expected by chance. 

100110011011000001100111110000101010100011001001010101000 


4. Sugisaki and Brown (9) report numbers of women's and men's pictures 
recognized by six Japanese girls and nine Japanese boys as follows: 


                  NUMBER OF PICTURES RECOGNIZED 

                    OWN        OPPOSITE 
                    SEX          SEX 
GIRLS 
  Fujiwi            172          142 
  Takagi            134          131 
  Ishibashi         180          163 
  Morimoto          161          162 
  Towata            149          139 
  Sano              155          130 
BOYS 
  Fukuzawa          170          159 
  Tsuchiya          131          125 
  Towata            114          101 
  Miyanchi          114          110 
  Yamagawi          111          122 
  Joseph            115          124 
  Nakao             119          128 
  Kiro               78           89 
  Takeo              81          114 


(a) Considering "Own Sex" and "Opposite Sex" as the two variables, apply 
Formula 13.13 to test whether the difference between the two means is 
significant at the .05 level. 

(b) Apply a distribution-free technique to determine whether the difference 
is significant at the .05 level. 
APPENDIX 


TABLE A    Proportions of the Area Under the Normal Probability Curve 
TABLE C    Distribution of χ² 
TABLE E    Values of the Exponential e^(−M) from M = .01 to M = .99 
TABLE F₁   The 1 Percent Points for the Distribution of F 
TABLE F₅   The 5 Percent Points for the Distribution of F 
TABLE O    Ordinates of the Normal Probability Curve 
TABLE P    Values of √(p/q) and √(q/p) and of y, the Ordinate of the 
           Normal Curve 
TABLE R    Critical Values of the Correlation Coefficient 
TABLE SQ   Squares of Numbers from .1 to 99.9 
TABLE SR   Square Roots of Integers from 1 to 999 
TABLE T    Absolute Values of Student's t 
TABLE V    Values of the Partial Variance in Relation to Multiple R 
TABLE Z    Values of z_r Corresponding to r from .000 to .999 
TABLE C. DISTRIBUTION OF χ² 


                                   P 

 df     .50      .20      .10      .05      .02      .01     .001 
  1     .455    1.642    2.706    3.841    5.412    6.635   10.827 
  2    1.386    3.219    4.605    5.991    7.824    9.210   13.815 
  3    2.366    4.642    6.251    7.815    9.837   11.345   16.266 
  4    3.357    5.989    7.779    9.488   11.668   13.277   18.467 
  5    4.351    7.289    9.236   11.070   13.388   15.086   20.515 
  6    5.348    8.558   10.645   12.592   15.033   16.812   22.457 
  7    6.346    9.803   12.017   14.067   16.622   18.475   24.322 
  8    7.344   11.030   13.362   15.507   18.168   20.090   26.125 
  9    8.343   12.242   14.684   16.919   19.679   21.666   27.877 
 10    9.342   13.442   15.987   18.307   21.161   23.209   29.588 
 11   10.341   14.631   17.275   19.675   22.618   24.725   31.264 
 12   11.340   15.812   18.549   21.026   24.054   26.217   32.909 
 13   12.340   16.985   19.812   22.362   25.472   27.688   34.528 
 14   13.339   18.151   21.064   23.685   26.873   29.141   36.123 
 15   14.339   19.311   22.307   24.996   28.259   30.578   37.697 
 16   15.338   20.465   23.542   26.296   29.633   32.000   39.252 
 17   16.338   21.615   24.769   27.587   30.995   33.409   40.790 
 18   17.338   22.760   25.989   28.869   32.346   34.805   42.312 
 19   18.338   23.900   27.204   30.144   33.687   36.191   43.820 
 20   19.337   25.038   28.412   31.410   35.020   37.566   45.315 
 21   20.337   26.171   29.615   32.671   36.343   38.932   46.797 
 22   21.337   27.301   30.813   33.924   37.659   40.289   48.268 
 23   22.337   28.429   32.007   35.172   38.968   41.638   49.728 
 24   23.337   29.553   33.196   36.415   40.270   42.980   51.179 
 25   24.337   30.675   34.382   37.652   41.566   44.314   52.620 
 26   25.336   31.795   35.563   38.885   42.856   45.642   54.052 
 27   26.336   32.912   36.741   40.113   44.140   46.963   55.476 
 28   27.336   34.027   37.916   41.337   45.419   48.278   56.893 
 29   28.336   35.139   39.087   42.557   46.693   49.588   58.302 
 30   29.336   36.250   40.256   43.773   47.962   50.892   59.703 


Note: For larger values of ν, the expression $\sqrt{2\chi^2} - \sqrt{2\nu - 1}$ (in which ν is the number of degrees of 
freedom) may be used as a normal deviate with unit variance, remembering that the probability of χ² 
corresponds with that of a single tail of the normal curve. 
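
The expression in the note can be tried out with a one-line Python function. The name `chi_square_normal_deviate` is hypothetical; the check value is the tabled 5 percent point for 30 degrees of freedom, which should give a deviate close to 1.645.

```python
import math


def chi_square_normal_deviate(chi_square, df):
    """Approximate normal deviate: sqrt(2 * chi-square) - sqrt(2 * df - 1)."""
    return math.sqrt(2 * chi_square) - math.sqrt(2 * df - 1)


# Tabled value from Table C: df = 30, P = .05.
deviate = chi_square_normal_deviate(43.773, 30)
```

The deviate is referred to one tail of the normal curve, as the note states.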


Source: Table C is abridged from Table IV of Fisher and Yates: "Statistical Tables for Biological, 
Agricultural and Medical Research," published by Oliver and Boyd Ltd., Edinburgh, and by permission of 
the authors and publishers. 
TABLE E. VALUES OF THE EXPONENTIAL e^(−M) FROM M = .01 TO M = .99 


In a Poisson distribution, M = σ². The probability of any value X, as given by 
Formula 12.7, is 

$$p = \frac{M^X e^{-M}}{X!} \qquad (X = 0, 1, 2, \ldots)$$

The following table gives values of e^(−M) from M = .01 to M = .99, in which e is 
2.718..., the base of the natural system of logarithms. 
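
For readers who prefer to compute rather than look up e^(−M), the Poisson probability can be sketched in Python. The function name `poisson_probability` and the value M = .50 are illustrative choices, not taken from the text.

```python
import math


def poisson_probability(x, m):
    """Probability of the value x in a Poisson distribution with mean m."""
    return m ** x * math.exp(-m) / math.factorial(x)


# With M = .50 the probability of X = 0 is simply e^(-.50).
p_zero = poisson_probability(0, 0.50)
p_two = poisson_probability(2, 0.50)
```

Rounded to three places, `p_zero` agrees with the tabled value of e^(−M) for M = .50.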


   M    e^(−M)        M    e^(−M)        M    e^(−M) 

  .01    .990        .34    .712        .67    .512 
  .02    .980        .35    .705        .68    .507 
  .03    .970        .36    .698        .69    .502 
  .04    .961        .37    .691        .70    .497 
  .05    .951        .38    .684        .71    .492 
  .06    .942        .39    .677        .72    .487 
  .07    .932        .40    .670        .73    .482 
  .08    .923        .41    .664        .74    .477 
  .09    .914        .42    .657        .75    .472 
  .10    .905        .43    .651        .76    .468 
  .11    .896        .44    .644        .77    .463 
  .12    .887        .45    .638        .78    .458 
  .13    .878        .46    .631        .79    .454 
  .14    .869        .47    .625        .80    .449 
  .15    .861        .48    .619        .81    .445 
  .16    .852        .49    .613        .82    .440 
  .17    .844        .50    .607        .83    .436 
  .18    .835        .51    .600        .84    .432 
  .19    .827        .52    .595        .85    .427 
  .20    .819        .53    .589        .86    .423 
  .21    .811        .54    .583        .87    .419 
  .22    .803        .55    .577        .88    .415 
  .23    .795        .56    .571        .89    .411 
  .24    .787        .57    .566        .90    .407 
  .25    .779        .58    .560        .91    .403 
  .26    .771        .59    .554        .92    .399 
  .27    .763        .60    .549        .93    .395 
  .28    .756        .61    .543        .94    .391 
  .29    .748        .62    .538        .95    .387 
  .30    .741        .63    .533        .96    .383 
  .31    .733        .64    .527        .97    .379 
  .32    .726        .65    .522        .98    .375 
  .33    .719        .66    .517        .99    .372 
TABLE F₁. THE 1 PERCENT POINTS FOR THE DISTRIBUTION OF F 


                                     NUMERATOR df 

DENOMINATOR 
    df        1      2      3      4      5      6      8     12     24      ∞ 

     1      4052   4999   5403   5625   5764   5859   5982   6106   6234   6366 
     2     98.50  99.00  99.17  99.25  99.30  99.33  99.37  99.42  99.46  99.50 
     3     34.12  30.82  29.46  28.71  28.24  27.91  27.49  27.05  26.60  26.12 
     4     21.20  18.00  16.69  15.98  15.52  15.21  14.80  14.37  13.93  13.46 
     5     16.26  13.27  12.06  11.39  10.97  10.67  10.29   9.89   9.47   9.02 
     6     13.74  10.92   9.78   9.15   8.75   8.47   8.10   7.72   7.31   6.88 
     7     12.25   9.55   8.45   7.85   7.46   7.19   6.84   6.47   6.07   5.65 
     8     11.26   8.65   7.59   7.01   6.63   6.37   6.03   5.67   5.28   4.86 
     9     10.56   8.02   6.99   6.42   6.06   5.80   5.47   5.11   4.73   4.31 
    10     10.04   7.56   6.55   5.99   5.64   5.39   5.06   4.71   4.33   3.91 
    11      9.65   7.20   6.22   5.67   5.32   5.07   4.74   4.40   4.02   3.60 
    12      9.33   6.93   5.95   5.41   5.06   4.82   4.50   4.16   3.78   3.36 
    13      9.07   6.70   5.74   5.20   4.86   4.62   4.30   3.96   3.59   3.16 
    14      8.86   6.51   5.56   5.03   4.69   4.46   4.14   3.80   3.43   3.00 
    15      8.68   6.36   5.42   4.89   4.56   4.32   4.00   3.67   3.29   2.87 
    16      8.53   6.23   5.29   4.77   4.44   4.20   3.89   3.55   3.18   2.75 
    17      8.40   6.11   5.18   4.67   4.34   4.10   3.79   3.45   3.08   2.65 
    18      8.28   6.01   5.09   4.58   4.25   4.01   3.71   3.37   3.00   2.57 
    19      8.18   5.93   5.01   4.50   4.17   3.94   3.63   3.30   2.92   2.49 
    20      8.10   5.85   4.94   4.43   4.10   3.87   3.56   3.23   2.86   2.42 
    21      8.02   5.78   4.87   4.37   4.04   3.81   3.51   3.17   2.80   2.36 
    22      7.94   5.72   4.82   4.31   3.99   3.76   3.45   3.12   2.75   2.31 
    23      7.88   5.66   4.76   4.26   3.94   3.71   3.41   3.07   2.70   2.26 
    24      7.82   5.61   4.72   4.22   3.90   3.67   3.36   3.03   2.66   2.21 
    25      7.77   5.57   4.68   4.18   3.86   3.63   3.32   2.99   2.62   2.17 
    26      7.72   5.53   4.64   4.14   3.82   3.59   3.29   2.96   2.58   2.13 
    27      7.68   5.49   4.60   4.11   3.78   3.56   3.26   2.93   2.55   2.10 
    28      7.64   5.45   4.57   4.07   3.75   3.53   3.23   2.90   2.52   2.06 
    29      7.60   5.42   4.54   4.04   3.73   3.50   3.20   2.87   2.49   2.03 
    30      7.56   5.39   4.51   4.02   3.70   3.47   3.17   2.84   2.47   2.01 
    40      7.31   5.18   4.31   3.83   3.51   3.29   2.99   2.66   2.29   1.80 
    60      7.08   4.98   4.13   3.65   3.34   3.12   2.82   2.50   2.12   1.60 
   120      6.85   4.79   3.95   3.48   3.17   2.96   2.66   2.34   1.95   1.38 
     ∞      6.64   4.60   3.78   3.32   3.02   2.80   2.51   2.18   1.79   1.00 


Note: In using this table, the greater mean square must be the numerator of F. (The 5 percent points for 
the distribution of F are on page 489.) 

Source: Table F₁ is abridged from Table V of Fisher and Yates: "Statistical Tables for Biological, Agri- 
cultural and Medical Research," published by Oliver and Boyd Ltd., Edinburgh, and by permission of the 
authors and publishers. 
TABLE F₅. THE 5 PERCENT POINTS FOR THE DISTRIBUTION OF F 


                                     NUMERATOR df 

DENOMINATOR 
    df        1      2      3      4      5      6      8     12     24      ∞ 

     1     161.4  199.5  215.7  224.6  230.2  234.0  238.9  243.9  249.0  254.3 
     2     18.51  19.00  19.16  19.25  19.30  19.33  19.37  19.41  19.45  19.50 
     3     10.13   9.55   9.28   9.12   9.01   8.94   8.84   8.74   8.64   8.53 
     4      7.71   6.94   6.59   6.39   6.26   6.16   6.04   5.91   5.77   5.63 
     5      6.61   5.79   5.41   5.19   5.05   4.95   4.82   4.68   4.53   4.36 
     6      5.99   5.14   4.76   4.53   4.39   4.28   4.15   4.00   3.84   3.67 
     7      5.59   4.74   4.35   4.12   3.97   3.87   3.73   3.57   3.41   3.23 
     8      5.32   4.46   4.07   3.84   3.69   3.58   3.44   3.28   3.12   2.93 
     9      5.12   4.26   3.86   3.63   3.48   3.37   3.23   3.07   2.90   2.71 
    10      4.96   4.10   3.71   3.48   3.33   3.22   3.07   2.91   2.74   2.54 
    11      4.84   3.98   3.59   3.36   3.20   3.09   2.95   2.79   2.61   2.40 
    12      4.75   3.88   3.49   3.26   3.11   3.00   2.85   2.69   2.50   2.30 
    13      4.67   3.80   3.41   3.18   3.02   2.92   2.77   2.60   2.42   2.21 
    14      4.60   3.74   3.34   3.11   2.96   2.85   2.70   2.53   2.35   2.13 
    15      4.54   3.68   3.29   3.06   2.90   2.79   2.64   2.48   2.29   2.07 
    16      4.49   3.63   3.24   3.01   2.85   2.74   2.59   2.42   2.24   2.01 
    17      4.45   3.59   3.20   2.96   2.81   2.70   2.55   2.38   2.19   1.96 
    18      4.41   3.55   3.16   2.93   2.77   2.66   2.51   2.34   2.15   1.92 
    19      4.38   3.52   3.13   2.90   2.74   2.63   2.48   2.31   2.11   1.88 
    20      4.35   3.49   3.10   2.87   2.71   2.60   2.45   2.28   2.08   1.84 
    21      4.32   3.47   3.07   2.84   2.68   2.57   2.42   2.25   2.05   1.81 
    22      4.30   3.44   3.05   2.82   2.66   2.55   2.40   2.23   2.03   1.78 
    23      4.28   3.42   3.03   2.80   2.64   2.53   2.38   2.20   2.00   1.76 
    24      4.26   3.40   3.01   2.78   2.62   2.51   2.36   2.18   1.98   1.73 
    25      4.24   3.38   2.99   2.76   2.60   2.49   2.34   2.16   1.96   1.71 
    26      4.22   3.37   2.98   2.74   2.59   2.47   2.32   2.15   1.95   1.69 
    27      4.21   3.35   2.96   2.73   2.57   2.46   2.30   2.13   1.93   1.67 
    28      4.20   3.34   2.95   2.71   2.56   2.44   2.29   2.12   1.91   1.65 
    29      4.18   3.33   2.93   2.70   2.54   2.43   2.28   2.10   1.90   1.64 
    30      4.17   3.32   2.92   2.69   2.53   2.42   2.27   2.09   1.89   1.62 
    40      4.08   3.23   2.84   2.61   2.45   2.34   2.18   2.00   1.79   1.51 
    60      4.00   3.15   2.76   2.52   2.37   2.25   2.10   1.92   1.70   1.39 
   120      3.92   3.07   2.68   2.45   2.29   2.17   2.02   1.83   1.61   1.25 
     ∞      3.84   2.99   2.60   2.37   2.21   2.10   1.94   1.75   1.52   1.00 


Note: In using this table, the greater mean square must be the numerator of F. (The 1 percent points for 
the distribution of F are on page 488.) 

Source: Table F₅ is abridged from Table V of Fisher and Yates: "Statistical Tables for Biological, Agri- 
cultural and Medical Research," published by Oliver and Boyd Ltd., Edinburgh, and by permission of the 
authors and publishers. 
TABLE P. VALUES OF √(p/q) AND √(q/p) AND OF y, THE ORDINATE OF THE NORMAL 
CURVE 


(p + q) = 1 

   p     q    √(p/q)   √(q/p)     y          p     q    √(p/q)   √(q/p)     y 
  .01   .99    .1005   9.9499   .0267       .51   .49   1.0202    .9802   .3988 
  .02   .98    .1429   7.0000   .0484       .52   .48   1.0408    .9608   .3984 
  .03   .97    .1759   5.6862   .0680       .53   .47   1.0619    .9417   .3978 
  .04   .96    .2041   4.8990   .0862       .54   .46   1.0835    .9230   .3969 
  .05   .95    .2294   4.3589   .1031       .55   .45   1.1055    .9045   .3958 
  .06   .94    .2526   3.9581   .1191       .56   .44   1.1282    .8864   .3944 
  .07   .93    .2744   3.6450   .1343       .57   .43   1.1513    .8686   .3928 
  .08   .92    .2949   3.3912   .1487       .58   .42   1.1751    .8510   .3909 
  .09   .91    .3145   3.1798   .1624       .59   .41   1.1996    .8336   .3887 
  .10   .90    .3333   3.0000   .1755       .60   .40   1.2247    .8165   .3863 
  .11   .89    .3516   2.8445   .1880       .61   .39   1.2506    .7996   .3837 
  .12   .88    .3693   2.7080   .2000       .62   .38   1.2773    .7829   .3808 
  .13   .87    .3866   2.5870   .2115       .63   .37   1.3049    .7664   .3776 
  .14   .86    .4035   2.4785   .2226       .64   .36   1.3333    .7500   .3741 
  .15   .85    .4201   2.3805   .2332       .65   .35   1.3628    .7338   .3704 
  .16   .84    .4364   2.2913   .2433       .66   .34   1.3933    .7177   .3664 
  .17   .83    .4526   2.2096   .2531       .67   .33   1.4249    .7018   .3621 
  .18   .82    .4685   2.1344   .2624       .68   .32   1.4577    .6860   .3576 
  .19   .81    .4843   2.0647   .2714       .69   .31   1.4919    .6703   .3528 
  .20   .80    .5000   2.0000   .2800       .70   .30   1.5275    .6547   .3477 
  .21   .79    .5156   1.9396   .2882       .71   .29   1.5647    .6391   .3423 
  .22   .78    .5311   1.8829   .2961       .72   .28   1.6036    .6236   .3366 
  .23   .77    .5465   1.8297   .3036       .73   .27   1.6443    .6082   .3306 
  .24   .76    .5620   1.7795   .3109       .74   .26   1.6871    .5927   .3244 
  .25   .75    .5774   1.7321   .3178       .75   .25   1.7321    .5774   .3178 
  .26   .74    .5927   1.6871   .3244       .76   .24   1.7795    .5620   .3109 
  .27   .73    .6082   1.6443   .3306       .77   .23   1.8297    .5465   .3036 
  .28   .72    .6236   1.6036   .3366       .78   .22   1.8829    .5311   .2961 
  .29   .71    .6391   1.5647   .3423       .79   .21   1.9396    .5156   .2882 
  .30   .70    .6547   1.5275   .3477       .80   .20   2.0000    .5000   .2800 
  .31   .69    .6703   1.4919   .3528       .81   .19   2.0647    .4843   .2714 
  .32   .68    .6860   1.4577   .3576       .82   .18   2.1344    .4685   .2624 
  .33   .67    .7018   1.4249   .3621       .83   .17   2.2096    .4526   .2531 
  .34   .66    .7177   1.3933   .3664       .84   .16   2.2913    .4364   .2433 
  .35   .65    .7338   1.3628   .3704       .85   .15   2.3805    .4201   .2332 
  .36   .64    .7500   1.3333   .3741       .86   .14   2.4785    .4035   .2226 
  .37   .63    .7664   1.3049   .3776       .87   .13   2.5870    .3866   .2115 
  .38   .62    .7829   1.2773   .3808       .88   .12   2.7080    .3693   .2000 
  .39   .61    .7996   1.2506   .3837       .89   .11   2.8445    .3516   .1880 
  .40   .60    .8165   1.2247   .3863       .90   .10   3.0000    .3333   .1755 
  .41   .59    .8336   1.1996   .3887       .91   .09   3.1798    .3145   .1624 
  .42   .58    .8510   1.1751   .3909       .92   .08   3.3912    .2949   .1487 
  .43   .57    .8686   1.1513   .3928       .93   .07   3.6450    .2744   .1343 
  .44   .56    .8864   1.1282   .3944       .94   .06   3.9581    .2526   .1191 
  .45   .55    .9045   1.1055   .3958       .95   .05   4.3589    .2294   .1031 
  .46   .54    .9230   1.0835   .3969       .96   .04   4.8990    .2041   .0862 
  .47   .53    .9417   1.0619   .3978       .97   .03   5.6862    .1759   .0680 
  .48   .52    .9608   1.0408   .3984       .98   .02   7.0000    .1429   .0484 
  .49   .51    .9802   1.0202   .3988       .99   .01   9.9499    .1005   .0267 
  .50   .50   1.0000   1.0000   .3989 
TABLE R. CRITICAL VALUES OF THE CORRELATION COEFFICIENT 


          LEVEL OF SIGNIFICANCE FOR ONE-TAILED TEST 

           .05       .025       .01       .005 

          LEVEL OF SIGNIFICANCE FOR TWO-TAILED TEST 

  df       .10       .05        .02       .01 
   1      .9877     .9969      .9995     .9999 
   2      .9000     .9500      .9800     .9900 
   3      .8054     .8783      .9343     .9587 
   4      .7293     .8114      .8822     .9172 
   5      .6694     .7545      .8329     .8745 
   6      .6215     .7067      .7887     .8343 
   7      .5822     .6664      .7498     .7977 
   8      .5494     .6319      .7155     .7646 
   9      .5214     .6021      .6851     .7348 
  10      .4973     .5760      .6581     .7079 
  11      .4762     .5529      .6339     .6835 
  12      .4575     .5324      .6120     .6614 
  13      .4409     .5139      .5923     .6411 
  14      .4259     .4973      .5742     .6226 
  15      .4124     .4821      .5577     .6055 
  16      .4000     .4683      .5425     .5897 
  17      .3887     .4555      .5285     .5751 
  18      .3783     .4438      .5155     .5614 
  19      .3687     .4329      .5034     .5487 
  20      .3598     .4227      .4921     .5368 
  25      .3233     .3809      .4451     .4869 
  30      .2960     .3494      .4093     .4487 
  35      .2746     .3246      .3810     .4182 
  40      .2573     .3044      .3578     .3932 
  45      .2428     .2875      .3384     .3721 
  50      .2306     .2732      .3218     .3541 
  60      .2108     .2500      .2948     .3248 
  70      .1954     .2319      .2737     .3017 
  80      .1829     .2172      .2565     .2830 
  90      .1726     .2050      .2422     .2673 
 100      .1638     .1946      .2301     .2540 


Source: Table R is abridged from Table VI of Fisher and Yates: "Statistical Tables for Biological, Agri- 
cultural and Medical Research," published by Oliver and Boyd Ltd., Edinburgh, and by permission of 
the authors and publishers. 
TABLE T. ABSOLUTE VALUES OF STUDENT'S t 


  Pᵃ      .10       .05       .02       .01 
  Pᵇ      .05       .025      .01       .005 

  df 
   1     6.314    12.706    31.821    63.657 
   2     2.920     4.303     6.965     9.925 
   3     2.353     3.182     4.541     5.841 
   4     2.132     2.776     3.747     4.604 
   5     2.015     2.571     3.365     4.032 
   6     1.943     2.447     3.143     3.707 
   7     1.895     2.365     2.998     3.499 
   8     1.860     2.306     2.896     3.355 
   9     1.833     2.262     2.821     3.250 
  10     1.812     2.228     2.764     3.169 
  11     1.796     2.201     2.718     3.106 
  12     1.782     2.179     2.681     3.055 
  13     1.771     2.160     2.650     3.012 
  14     1.761     2.145     2.624     2.977 
  15     1.753     2.131     2.602     2.947 
  16     1.746     2.120     2.583     2.921 
  17     1.740     2.110     2.567     2.898 
  18     1.734     2.101     2.552     2.878 
  19     1.729     2.093     2.539     2.861 
  20     1.725     2.086     2.528     2.845 
  21     1.721     2.080     2.518     2.831 
  22     1.717     2.074     2.508     2.819 
  23     1.714     2.069     2.500     2.807 
  24     1.711     2.064     2.492     2.797 
  25     1.708     2.060     2.485     2.787 
  26     1.706     2.056     2.479     2.779 
  27     1.703     2.052     2.473     2.771 
  28     1.701     2.048     2.467     2.763 
  29     1.699     2.045     2.462     2.756 
  30     1.697     2.042     2.457     2.750 
  40     1.684     2.021     2.423     2.704 
  60     1.671     2.000     2.390     2.660 
   ∞     1.645     1.960     2.326     2.576 


a Probability of a deviation numerically greater than t; for use in two-sided tests. 
b Probability of a deviation greater than t; for use in one-sided tests. 


Source: Table T is abridged from Table III of Fisher and Yates: "Statistical Tables for Biological, 
Agricultural and Medical Research," published by Oliver and Boyd Ltd., Edinburgh, and by permission 
of the authors and publishers. 
TABLE V. VALUES OF THE PARTIAL VARIANCE IN RELATION TO MULTIPLE R 


   PARTIAL                  PARTIAL                  PARTIAL 
   VARIANCE       R         VARIANCE       R         VARIANCE       R 

 .0000-.0099    1.00      .5710-.5839     .65      .9070-.9129     .30 
 .0100-.0297     .99      .5840-.5967     .64      .9130-.9187     .29 
 .0298-.0493     .98      .5968-.6093     .63      .9188-.9243     .28 
 .0494-.0687     .97      .6094-.6217     .62      .9244-.9297     .27 
 .0688-.0879     .96      .6218-.6339     .61      .9298-.9349     .26 
 .0880-.1069     .95      .6340-.6459     .60      .9350-.9399     .25 
 .1070-.1257     .94      .6460-.6577     .59      .9400-.9447     .24 
 .1258-.1443     .93      .6578-.6693     .58      .9448-.9493     .23 
 .1444-.1627     .92      .6694-.6807     .57      .9494-.9537     .22 
 .1628-.1809     .91      .6808-.6919     .56      .9538-.9579     .21 
 .1810-.1989     .90      .6920-.7029     .55      .9580-.9619     .20 
 .1990-.2167     .89      .7030-.7137     .54      .9620-.9657     .19 
 .2168-.2343     .88      .7138-.7243     .53      .9658-.9693     .18 
 .2344-.2517     .87      .7244-.7347     .52      .9694-.9727     .17 
 .2518-.2689     .86      .7348-.7449     .51      .9728-.9759     .16 
 .2690-.2859     .85      .7450-.7549     .50      .9760-.9789     .15 
 .2860-.3027     .84      .7550-.7647     .49      .9790-.9817     .14 
 .3028-.3193     .83      .7648-.7743     .48      .9818-.9843     .13 
 .3194-.3357     .82      .7744-.7837     .47      .9844-.9867     .12 
 .3358-.3519     .81      .7838-.7929     .46      .9868-.9889     .11 
 .3520-.3679     .80      .7930-.8019     .45      .9890-.9909     .10 
 .3680-.3837     .79      .8020-.8107     .44      .9910-.9927     .09 
 .3838-.3993     .78      .8108-.8193     .43      .9928-.9943     .08 
 .3994-.4147     .77      .8194-.8277     .42      .9944-.9957     .07 
 .4148-.4299     .76      .8278-.8359     .41      .9958-.9969     .06 
 .4300-.4449     .75      .8360-.8439     .40      .9970-.9979     .05 
 .4450-.4597     .74      .8440-.8517     .39      .9980-.9987     .04 
 .4598-.4743     .73      .8518-.8593     .38      .9988-.9993     .03 
 .4744-.4887     .72      .8594-.8667     .37      .9994-.9997     .02 
 .4888-.5029     .71      .8668-.8739     .36      .9998-.9999     .01 
 .5030-.5169     .70      .8740-.8809     .35 
 .5170-.5307     .69      .8810-.8877     .34 
 .5308-.5443     .68      .8878-.8943     .33 
 .5444-.5577     .67      .8944-.9007     .32 
 .5578-.5709     .66      .9008-.9069     .31 


Source: From DuBois, Philip H., Multivariate Correlational Analysis. New York: Harper & Row, 1957. 
GLOSSARY 
OF SYMBOLS 


Symbols used in statistical writings vary widely. The following list comprises most 
of the symbols used in this text. Here the prime (′) is used to indicate a statistic 
or variable that has been altered in some way. A tilde (~) indicates a predicted 
value, either of a statistic or of a variable. A sample mean is indicated as X̄ or M 
and a sample standard deviation as s, with μ and σ as corresponding parameters. 
Other parameters are denoted by a circumflex accent (^) over the symbol for the 
statistic. 


a (1) Element in matrix A. The element in the ith row and jth column is 
denoted as a_ij; (2) weight to be applied to the wrongs in a scoring 
formula; (3) in deviation form, the true or reliable portion of an 
observed score or value. If the error is e, then x = a + e. 

A Matrices, consisting of rows and columns with assigned meaning, are 
generally denoted by boldface capitals. Examples are A, X, Z. The 
transpose of A is A′ or Aᵗ, while the inverse is A⁻¹. 

b_ij.q Regression weight to be used with scores with original standard 
deviations. Here i is the criterion, j the variable to which the weight is 
applied and q refers to all other variables taken into consideration. 

c Number of columns. 

C (1) Coefficient of contingency expressing the relationship between two 
categorical variables; (2) covariance or mean product of pairs of 
deviations of two variables. Any covariance divided by the standard 
deviations of the variables becomes a product moment r; (3) a 
transformation of scores having approximately normal distribution; (4) a 
term in Bartlett's test of homogeneity of variance. 

Cumulative frequency. 

C_ij.q The covariance between two residual variables, q representing the 
variable or variables partialed out from the two primary variables 
i and j. 

Confidence coefficient designating limits within which, at a stated level 
of certainty, a parameter is likely to be found. 

Deviation in terms of step intervals from an arbitrary origin. 
Degrees of freedom, the number of values free to vary after one or 
more independent restrictions have been imposed on the total number 
of values. 

(1) Difference, as between two means, or the two members of a pair of 
scores, or between pairs of ranks; (2) a modified range, (P90 − P10). 
(1) In deviation form, the random error component of a variable 
(e = x − a); (2) the base of the natural system of logarithms, approximately 
2.7183. 

An arbitrary symbol used to indicate an element, either variance or 
covariance, in matrix computations leading to multivariate correlation. 
Frequency of a subgroup. Frequencies in rows may be denoted as f_r, 
frequencies in columns as f_c. In a scatter diagram the marginal 
frequencies are f_x and f_y, with f_xy indicating cell frequencies. When 
distinction is needed between observed and theoretical or expected 
frequencies, they may be denoted as f_o and f_e respectively. 

Ratio of two independent estimates of the population variance, which 
may be tested for significance by comparison with the appropriate F 
distribution. 

A factor matrix consisting of the correlations of each variable in R 
with each of the extracted factors. 

A hypothetical variable or factor posited to explain the intercorrela- 
tions of observed variables. In multiple factor analysis, the general 
factors may be denoted as ga, gb ... . 

A statistic reflecting skewness in a distribution. 

A statistic reflecting kurtosis in a distribution. 

The common variance of an observed variable. The object of factor 
analysis is to divide the observed variance into two portions: h², 
the communality; and u², the unique variance. The communality is 
further subdivided into portions identical with each of the posited 
factors. 

(1) Value of the step interval; (2) as a subscript, i refers to any variable 
or any value. When other subscripts are needed, j, k, l ... may be used. 
Identity matrix. It has unity in the diagonal cells, zeroes elsewhere. 
Used in matrix equations. 

In simple analysis of variance, number of treatments or categories in 
the independent variable. 

Lower confidence limit. 


m (1) Moment about the mean. The value of the nth moment is Σxⁿ/N; 
(2) a number different from N and from n; (3) multiplying factor, used 
in computations from a cumulative distribution. 

Arithmetic mean. 

Assigned or arbitrary mean or the value of an arbitrary origin used in 
computations with a frequency distribution. 

Mode. 

Number different from N, such as number of variables or categories; 
number of items in a test; number of cases within a category; or the 
number of times a test is lengthened. 

Total number of cases in a sample. 

Number of cases in one of two categories. The number in the other 
category is N_q. 
Proportion of cases within a category. When there are only two 
categories, p = N_p/N, q = N_q/N and p + q = 1. 

(1) Percentile. P_j is defined as the point in the distribution below which 
j percent of the cases are found; (2) probability. 

Probable error. 

(1) The proportion of cases in a second category, with p being the 
proportion of cases in the first category; (2) an arbitrary symbol 
designating one or more variables that have been partialed out of the 
primary variable or variables. 
Semi-interquartile range, defined as (P75 − P25)/2. 

(1) Product moment correlation, the covariance of two variables in z 
form; (2) number of rows. 

Correlation between a continuous and a dichotomous variable that 
would be expected if the dichotomous variable were normally and 
continuously distributed. 

r_ij.q Partial r, that is, the product moment correlation between two residual 
variables, used to estimate the correlation between variables i and j in 
samples with no variability in q, the variable or variables partialed out. 

r_i(j.q) Part correlation, that is, the product moment r between an observed 
and a residual variable. 

r_pt.bis Point biserial correlation, that is, the product moment r between a 
dichotomous variable and a continuous variable. 

r_tet Tetrachoric correlation, that is, an estimate of the correlation that 
would be found if two observed dichotomous variables were continuously 
and normally distributed and if full information were available. 

Reliability or self-correlation of a variable, available only as an 
estimate. 

(1) Multiple correlation, that is, product moment r between an 
unmodified variable on the one hand and the weighted sum of two or 
more variables on the other, the weights being developed so that, in the 
sample, the correlation is at a maximum; (2) rank; (3) in a scoring 
formula, number of right answers. 

A matrix of correlations. 
Standard deviation in the sample, defined as √(Σx²/N).

Standard error of estimate, defined as the standard deviation of the
errors around the regression line used in estimating variable 0.

Partial standard deviation, or s of variable i after the variable or
variables denoted as q have been partialed out.

Standard error of measurement or the expected standard deviation of
the differences between observed scores and true scores.

With X indicating any statistic, s_X refers to the standard error of the
statistic as estimated from a sample. Examples are s_M and s_(M1-M2).

Skewness.

Standard score. A linear transformation of obtained scores, with 
assigned mean and standard deviation. 

Theoretical distribution of a normally distributed variable divided by 
ху; also an observed statistic tested by this distribution. 

(1) Standard score with arbitrary mean of 50 and arbitrary standard 
deviation of 10; (2) the smaller sum of ranks in the matched pairs, 
signed-ranks test; (3) total score. 

The unique component of any observed variable, consisting of error 
and specific variability. 

(1) Upper confidence limit; (2) a statistic involving ranks that may be 
tested for significance by a procedure developed by Mann and 
Whitney. 

The variance, or mean of the squares of the deviations from the 
arithmetic mean. 

In analysis of variance, the between groups variance or the between
groups sum of squares divided by the appropriate number of degrees
of freedom. The ratio, V_b/V_w, yields F.

Within groups variance, or within groups mean square.

Partial variance. The variance of variable i after the variable or 
variables denoted as q have been partialed out. 

Weight applied to variable i. 

In a scoring formula, number of wrong answers. 

Deviation from the sample mean. 

Deviation in coded scores from an arbitrary mean. 

Any measured variable in original units. 

Any statistic. 

Height of the ordinate of the normal curve. 

Symbol alternate to X, indicating any measured variable in original 
units. 

A standard score with mean of zero and standard deviation of unity. 
Also, the number of standard deviation units above or below the 
mean. 

A residual variable in z form, q referring to the variable or variables
partialed out and which are uncorrelated with z_i.q. In such "higher
order z scores" the variance is less than 1.00.

Fisher’s transformation of the correlation coefficient, a transformation 
which varies without limit and is distributed approximately normally. 
(Alpha) (1) Selected level of significance; (2) constant of a Poisson
distribution, in which μ = σ² = α.

(Beta) Regression weight applicable to z scores. 

(Delta) Determinant. 

(Eta) The correlation ratio, measuring the fit of the observations to 
the means of the vertical or horizontal arrays. 

(Mu) Parameter mean. 

(Nu) Number of degrees of freedom. 

(Rho) Coefficient of rank correlation developed by Spearman. It is a 
variant of product moment r. 

(Sigma) Parameter standard deviation. 

(Sigma) Summation sign. 

(Tau) Coefficient of rank correlation developed by Kendall. 

(Phi) Product moment r between two dichotomous variables. 

(Chi) Chi square, used to determine the probability that a distribu- 
tion of frequencies within categories is in accordance with a stated 
hypothesis. 


Combinations of n things r at a time. 


Equals approximately. 

Does not equal. 

Greater than; less than. 

Greater than or equal to; less than or equal to.

Factorial. Factorial N is the product of N terms:
N(N - 1)(N - 2) ... 1.
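
Several of the definitions in this glossary are short enough to state as computations. The following is a minimal sketch in Python, not anything taken from the text itself: the function names are illustrative assumptions, and the code simply restates the glossary wording for the sample standard deviation (square root of Σx²/N, with x a deviation from the sample mean), the z score, the T score with mean 50 and standard deviation 10, the semi-interquartile range, and Fisher's transformation of r (written here as the inverse hyperbolic tangent, its usual form).

```python
import math


def standard_deviation(scores):
    """Sample standard deviation: sqrt(sum of squared deviations / N)."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((score - mean) ** 2 for score in scores) / n)


def z_scores(scores):
    """Standard scores with mean zero and standard deviation of unity."""
    mean = sum(scores) / len(scores)
    s = standard_deviation(scores)
    return [(score - mean) / s for score in scores]


def t_scores(scores):
    """Standard scores with arbitrary mean 50 and standard deviation 10."""
    return [50 + 10 * z for z in z_scores(scores)]


def semi_interquartile_range(p25, p75):
    """Half the distance between the 25th and 75th percentiles."""
    return (p75 - p25) / 2


def fisher_z(r):
    """Fisher's transformation of a correlation coefficient r (|r| < 1)."""
    return math.atanh(r)


if __name__ == "__main__":
    scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up illustrative scores
    print(standard_deviation(scores))
    print(t_scores(scores))
    print(semi_interquartile_range(40, 60))
    print(fisher_z(0.5))
    print(math.factorial(5))  # N(N - 1)(N - 2) ... 1 for N = 5
```

Note that the divisor is N, as in the glossary definition, rather than N - 1; library routines that estimate a population standard deviation from a sample usually divide by N - 1 and will give slightly larger values.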


INDEX 




Accuracy standard, of predictions, 244 
Actuarial predictions, defined, 243 
Addition: of matrices, 425 
requirements for, 99-100, 101 
See also Additive properties 
Additive properties: chi square, 319 
correlation coefficients, 219, 220-221 
covariances, 215-219 
variances, 215-219 
Adjoint matrix, 445 
Adjugate matrix, 445 
Alienation, coefficient of, 249 
Alternate forms, in reliability estimation, 
390-391, 394 
Analysis of variance, 357-386 
Bartlett’s test of variance homogeneity, 
375-376 
complex designs using, 377 
data arrangement in, 360-361 
degrees of freedom in, 366-367 
described, 357 
differences among means and, 359, 
369-371 
emphases in, 358-359 
F distribution in, 358 
F test in, 369, 371-372 
null hypothesis and, 358, 359, 362
principles, 362
purpose of simple, 374-375
randomized group design, 368-371
relationships of sums of squares to r
and eta, 372-375
significance testing and, 358, 365-366
single independent variable in, 367-371
sum of squares in, 360, 362-366
terminology, 360
two-way, 378-383
Applied psychology, 24, 240-241
Area sampling, 327-328
Areas, under normal curve, 278, 279-287,
484-485
Arithmetic mean, defined, 14, 99 
See also Mean 
Artificial matrix, single common factor 
and, 455-457 
Associations, see Relationships 
Asymmetrical matrix, finding inverse of, 
448 
Attenuation, correction for, 400-401, 452 
Average(s), types, 99 
See also Mean 
Average deviation, 104 
Average rank, 75 
Averaging, 18
description by, 99-124 
frequency distribution and, 112 
Axes, in factor analysis, 458-460, 
465 


b weights, defined, 165 
Balancing cases, 194 
Bartlett's test of homogeneity of variance, 
375-376 
Behavior, forecasting human, 239-268 
Beta (β) coefficients: checking, 173, 179,
182, 183
computation, 169-170, 173, 175-177,
179, 182, 183; from matrix equa-
tion, 434-438
defined, 165
orders of, 166
suppressor variables and negative, 184
Bias, in estimates of parameters, 328-329 
of statistics, 351 
Bifactor method in factor analysis, 464 
Bimodal distribution, 82 
Binomial distribution, 304-309, 312 
application of, 306-307, 309 
formula, 273-274, 305 
shape of, 304 
skewness and kurtosis, 306 
Binomial expansion, 304 
normal curve and, 274-275
Biserial correlation, 221, 223-224, 229-230
applications of, 223-224 
computation, 223 
formula for, 229-230 
as validity coefficient, 403 
Bivariate, defined, 57n 
Bivariate data, curvilinear correlation 
and, 232, 234-235 


C scale, in normalizing a distribution, 
298-300 
Cases, see Sample 
Categorical data, 30, 32 
chi square and inferences from, 52-72 
prediction with, 42-48 
Categories: combining, 67 
determination, 32 
nominal scale establishment, 31, 32 
statistics applicable to, 31
terminology, 30-31 
Cattell, James McKeen, 26 
Cell, defined, 360 
Central limit theorem, 300-301 
chi square distribution and, 317 
Central tendency: measures of, 14, 33, 
82, 99, 103 
measures of variability and, 87 
normal curve, 276 
Centroid solution, in factor analysis, 464 
Chance, frequency distributions and, 303- 
325 
prediction and, 247-248 
See also Probability 
Charlier's check, for scatter diagram, 151 
Charts: expectancy, 42-44, 242-243, 245 
preparation of, 6-10, 12 
time dimension in, 10-11 
Chi square (χ²), 52-72
addition of, 69, 319 
association testing, 54-55 
categorical data application, 31 
central limit theorem and, 317 
characteristics of, 312-313, 319 
in comparing two empirical distribu- 
tions, 55 
computation and interpretation exam- 
ples, 63-67 
degrees of freedom and: distribution 
and, 313-317, 318; with four degrees 
of freedom, 60, 61, 63-64, 65-66; 
with six degrees of freedom, 61, 66- 
67; with two degrees of freedom, 56, 
57-60, 61; Yates' correction for con- 
tinuity for 1df, 67-68 
distribution of, 56, 57-61, 486; theoreti- 
cal and empirical, 312-319 
F distribution and, 322 
formulas for, 55-57 
functions of, 52-55 
interpretation of, 297 
as nonparametric statistic, 471 
normality computation with, 294-
297
null hypothesis and, 53-54, 55
Poisson distribution testing with, 310-
312
relative frequencies of, 57-61
significance testing, 62-63
small frequencies and, 67
steps in testing hypotheses, 53-54
in testing distribution for normality,
294-297
Coded scores, 112-119, 140-152
Coding, computation and, 112-119
error introduced by, 118
formulas for M and s based on, 118-
119
Coefficient(s): of alienation, 249
confidence, 352
in multiple R computation, 171-172
See also under specific coefficient,
e.g., Beta coefficients; Correla-
tion coefficient
Cofactor of determinant, 439 
Columns of matrices, 420-421 
Common factor, single, in factor analy- 
sis, 453-457 
Compound probability, 271-272 
Computation: correction for range
changes, 260
cumulative frequency procedure in, 117
inverse matrix, 446-449
multiple factor analysis, 463-465
normal equations, 434
See also under specific topic
Condensation, pivotal, 441
Confidence coefficient, 352
Confidence intervals, 352-354
Confidence limits: for mean, 353-354
for normally distributed statistic, 353-
354
Consistency: internal, of tests, 408-409
standard error of measurement in in-
terpretation, 401-402
Contingency coefficient (C), 41-42
associated nominal scales, 40
categorical data application, 31
chi square and, 69
computation of, 36-42
as distribution-free statistic, 478
formula, 37
measuring relationship with, 35-42
Continuity, Yates’ correction for, 67-
68
Continuous variables, prediction with, 
244-245, 249 


Control of extraneous variables, 191-199 
case balancing and, 194 
case matching and, 193 
case randomization and, 194-199 
case selection and, 192-193 
defined, 191 
residual variable formation and, 199 
variable elimination and, 192-193 
Correction(s): for attenuation, 400-401, 
452 
for changes of range in prediction, 255- 
262 
for continuity, Yates’, 67-69 
for guessing, in test scoring formulas, 
405 
independent variable, with known cri- 
terion variances, 261 
in prediction, 249, 255-262 
for restriction, with known predictor 
variances, 261-262 
Sheppard's correction for standard de- 
viation, 118 
Correlated means, 343, 346-348
significance of differences between,
346-348
Correlation: between true and obtained
test scores, 402-403 
biserial, 221, 223-224, 229-230 
changes in reliability and, 398-403 
comparison of partial r with r within 
groups, 202-205 
correction for attenuation, 400—401 
curvilinear, 231-235 
dichotomous variables, 242, 243, 245- 
248 
differential weighting in, 219 
estimation from variability around re- 
gression line, 256-257 
frequency distribution and, 155 
matrices of, factor analysis and, 453- 
470; matrix of r's as product of two, 
428-432 
nature of, 125-128 
part, 206, 208-210 
partial, 201-208 
phi coefficient, 224-228 
point biserial, 221-224 
prediction and, 136, 155-156, 205, 243, 
245, 262-266 
rank, 90-95; See also Rank correlation 
of ratios and residuals, 200-201 
regression and, 126, 130-132 
relationships and, 125-128, 155-156 
significance of differences between, 
348-350 
standard deviation and, 107 
in terms of determinants, 442-445 
tests with outside criteria, 403 
tetrachoric, 226-228, 230-231 
two variables: graphing relationship 
between, 127-128; linear, 125-163; 
regression equation, 130-132 
variables in, 126 
variance in, 106; division of variance, 
133-134, 135-136 
See also Correlation coefficient; 
Multiple correlation 
Correlation coefficient: additive proper- 
ties of, 219, 220-221 
computation, 137-153; from coded 
scores, 140-152; from raw scores, 
138-140, 144; from variances, 152- 
153 
corrected for attenuation, 452 
critical values of, 493 
derivation of, 129-130 
difference formula and sum formula 
for, 152-153 
factors affecting, 156 
Fisher's z_r transformation, 337-338,
339-340
formulas, 136-141, 152-153
interpretation, 153-156
measurement units and, 154-155
number of cases (N) and, 153 
prediction and, 136, 155-156 
purpose of, 375 
range changes and, 156, 257 
relationships of sums of squares from
analysis of variance with, 372-375
significance level, 333-335, 336-337,
338-339, 493; testing, 338-339
as slope of line of best fit, 128-130 
standard error of, 332-333 
t distribution application, 335-337 
variants of Pearson's, 221-229 
Correlation ratio (eta): computation of, 
231-234 
purpose of, 375 
relationships of sums of squares from 
analysis of variance with, 372-375 
significance of, 234 
Correlational analysis, inverse of com- 
plete matrix in, 447-448 
Counting, description by, 30-51 
Covariance(s): additive properties, 215-
219
as average, 136-137
of composite variables, 217-219
defined, 137
matrix of r's as product of two mat-
rices, 428-432
in multiple and partial correlation,
210-211
residual variables, 166-169, 169-170,
171
of weighted composites, 220-221
Criterion: in multiple correlation, 165
predictable variance of, 134-135
predicting dichotomized, 249-251
in prediction, 240-241
prediction of continuous, from continu-
ous predictor, 258-259
from regression equation, 126
standard deviation of, 136
unpredicted variance of, 134, 135
Criterion variables, 126, 190-191 
Critical point, in prediction, 254-255 
Cross-validation, 265-266
of multiple correlation, 182-183,
185
of prediction, 48
Cumulative frequencies: described, 77,
79
procedures in computation, 117
in scatter diagrams, 147-152
Curves: in expressing relationship be-
tween two variables, 9-10
with time dimension, 10-11
See also Normal curve
Cut-off points in prediction, 254-255
Curvilinear correlation, 231-235 
correlation ratio as measure of, 231-
234
fitting curvilinear function to bivariate 
data, 232, 234-235 


Data: arrangement, in analysis of vari- 
ance, 360-361 
bivariate, curvilinear correlation and, 
232, 234-235 
categorical, 30, 32, 42-48, 52-72 
in distribution-free tests, 472 
in psychological tests, 387-388 
See also under specific topic 
Deciles, score conversion into, 90 
Deduction, 326-356 
normal curve and, 284-286 
steps in, 340-341 
See also Forecasting; Prediction 
Degrees of freedom: in analysis of vari- 
ance, 366-367 
chi square distributions and, 313-317, 
318; chi square with four, 60, 61, 
63-64, 65-66; chi square with six, 
61, 66-67; chi square with two, 56, 
57-60, 61 
concept of, 35 
sum of squares, 365, 366-367 
Yates’ correction for continuity for 1df, 
67-69 
Dependent variables, 126, 190-191 
Descriptive statistics, defined, 2 
in experimental psychology, 23-24 
formulas in, 18, 20-23 
generalization and, 15-16, 326-330 
graphical methods, 2-12 
in ordinal vs. nominal scales, 74-75
prediction and, 16
in professional psychology, 24-25
ranking and, 75-76 
sources of psychological, 25-26 
symbols in, 18-20 
See also Statistics 
Determinants, 419, 438-445 
cofactor of, 439 
evaluation of, 439-441 
matrices as, 420 
in matrix summarizing, 444-445 
minors of, 439 
multiple correlation and, 442-443,
444-445
partial correlation and, 443-444
pivotal condensation of, 441
terminology and notation, 438-439
Deviation: average, 104
defined, 103n
from mean, as variability indicator, 104
See also Standard deviation
Deviation form of regression equation, 
132n 
Diagonal matrix, 423 
Diagonal method of factor analysis, 463 
Dichotomous criterion in prediction, 249- 
251 
Dichotomous test items, 388 
Dichotomous variables: expectancy 
charts, 242, 243 
in prediction, 245-246, 247-248 
Difference formula for correlation coeffi- 
cient, 152-153 
Differences: between correlations, 348-
351
between means, 342-348, 359 
statistic from hypothetical value, 351- 
352 
tests based on algebraic signs of, 476- 
478 
Difficulty of test responses, 406-407 
Digits, see Numbers 
Distractors in tests, 407-408 
Distribution-free statistical tests, 471-481
arithmetic operations for, 471-472
based on algebraic signs of differences,
476-478
coefficient of contingency and, 478
median test, 478
rank correlation and, 472-474
ranking and, 472-476
run test, 474-475
sign test, 476-478 
Distributions: bimodal, 82 
binomial distribution, 304-309, 312 
central limit theorem, 300-301 
chance and, 272-274, 303-325 
checking, 78, 115-116 
of chi square, 56, 57-61; degrees of 
freedom and, 56, 60, 61; theoretical 
and empirical, 312-319 
chi square statistics of, 54, 55, 294-
297
computations from, 112, 113-119, 120 
defined, 329 
description of, 14-15 
F distribution, 312, 321-323, 488-489 
F test and, 375 
formulas yielding functions of, 22-23 
hypotheses in testing, 303-304; chance 
and, 272-274 
inferences from, 269; with assumed 
normality, 284-287; chi square and, 
294-297; normal curve and, 281- 
284 
kurtosis in, 292-294, 306
leptokurtic, 293
mean computation from, 101-102, 112, 
113-119 
mesokurtic, 293 
multimodal, 82 
nature of theoretical, 269-274 
normal, 274-275; inferences from, with 
assumed normality, 284-287; limits 
enclosing percentages of, 281-284; 
normality testing, 294-297 
normalizing a distribution, 297-300 
percentile ranks of, 90 
platykurtic, 293 
point measure computation from, 112 
Poisson distribution, 309-312 
of a statistic, 329-330 
shape of, 304 
skewness, 287-292 
standard deviation computation from, 
112, 113-119 
t distribution, 312, 319-321 
tallying, 78, 80 
unimodal, 82 
variance computation from, 112, 113- 
119 
See also Curves; Frequency dis- 
tribution; Normal curve; Proba- 
bility 


Elderton, W. P., 61 
Empirical scoring formulas, 405-406 
Equations: factor analysis, 461-462 
matrix, 424-425; reading, 448-449
Error of the first kind, 341-342 
Error of the second kind, 342 
Error term in analysis of variance, 381- 
382, 383 
Error(s): introduced by coding, 118 
in prediction, 244, 249; standard devia- 
tions of, 135-136 
probable, normal curve and, 287 
random, 388-389, 390 
types, in statistical tests, 341-342 
variance, 14 
See also Standard error 
Estimate, standard error of, 136, 249 
Estimation: formulas, 22 
parameters of, 328-329 
Expectancy chart, 42-44, 242-243, 245 
Experimental psychology, 23-24 


F distribution, 312, 321-323 
in analysis of variance, 358 
chi square and, 322 
defined, 358 
variance ratio and, 365 
F ratio, 363
significance level of, 375 
in two-way analysis of variance, 383 
F test, 358, 365-366 
in analysis of variance, 369, 371-372, 
381, 383 
distribution and, 375 
heterogeneous variances and, 376
Factor analysis, 452-470 
axes in, 458-460, 465 
bifactor method in, 464 
centroid method, 464 
diagonal or square root method, 463 
fitting a single factor, 462-463 
graphical representation of variables, 
458-462, 465 
limitations of, 460-461 
maximum likelihood method of, 465 
multiple, 463-465 
principal axes solution in, 464-465
purposes of, 465-466
report reading, 465-468 
single common factor in, 453-457 
steps in, 466 
Factorial study, defined, 361 
Fechner, Gustav, 25 
First quartile computation, 83 
Fisher, Sir Ronald A., 26, 61, 321 
Fisher's z_r transformation of correlation
coefficient, 337-338, 339-340, 504-
Forecasting human behavior, 239-
268
instruments of, 242-243
steps, 16
See also Prediction
Formulas: classes of, 21-23
computing, 22
defined, 18
definitions as, 21-22
understanding, 20-21
See also under specific formula
Fourfold table, for prediction, 247-248
Frequencies: combining small, 67
cumulative, 77, 79, 117, 147
relative, of chi square, 57-61
Frequency distribution, assumptions in,
checking, 78
data in, 78-79
nature of, 77-78
polygon construction, 5-6, 76-80
See also Distributions


Galton, Sir Francis, 25, 26, 126n 
Generalization from descriptive statistics,
15-16, 326-330
Goodness of fit, see Chi square
Gossett, William S., 319 
Gramian matrix, 445 
Graphing, in descriptive statistics, 2-12 
in factor analysis, 458-462, 465 
relationship between two variables, 
127-128 
Group predictions, 43, 243ff. 
Groups, comparison of partial r with r 
within groups, 202-205 
Guessing, correction for, in test scoring 
formulas, 405 


Heterogeneous tests, 387-388, 394, 396
Heterogeneous variances, F test and,
376
Histogram: chi square distributions, 316,
317
preparation, 3-5
Homogeneous psychological tests, 387-
388, 394, 396
Hull, Clark, 26
Hypothesis testing: chance distribution in,
272-274
chi square in, 52-54
F test in, 365-366
steps in, 340-341
t test in, 366
theoretical distributions in, 303-304


Identity matrix, 423, 432-434 
Independent variable, 126, 190-191 
Individual predictions, 43, 243ff. 
Interpretation: of chi square, 63-67,
297
contingency coefficient, 39
correlation coefficient, 153-156
percentiles and percentile ranks, 89-
90
raw scores, 411
standard errors, 332 
variance, 106 
Interval scales, 18 
Intervals, step, 78, 79-80, 142, 146 
Inverse matrix, 433-434 
computation, 446-449 
Item analysis, defined, 406 
difficulty or popularity of responses,
406-407
external validity, 409
internal consistency, 408-409
item selection by approximations to
multiple R, 409-410
popularity of distractors, 407-408
test variance and item statistics,
410


Kelley, Truman L., 26 
Kendall's tau, 90, 92-95, 478 
KR formula 20, 393-394, 396-398 
Kurtosis in frequency distributions, 292- 
294 
formula, 293-294 
testing for, 292-293, 306 


Least squares, defined, 128
Length of tests, reliability and, 393
validity and, 403-404
Leptokurtic distributions, 293
Levels in analysis of variance, 360
Levels of significance, see Significance
level
Linear relationships, 126-127
Linear transformation, 107-111


McCall, William A., 26 


Matched-pairs, signed-ranks test, 475-
476
Matching of cases, 193
Matrices, 419-438
addition and subtraction, 425
adjoint, 445
algebraic operations with, 425-432
artificial, single common factor and,
455-457
asymmetrical, 448
in beta computation, 176-177
concepts of, 419-420
of correlations, factor analysis and,
453-470
defined, 168n
determinants as, 420, 444-445
equality of, 425
equations, 424-425; beta computation
from, 434-438; establishment of,
435-436; identity matrix in, 432-
434; reading, 448-449; solution of,
436-438
gramian, 445
inverse, 433-434; computation, 446-
449
in multiple R computation, 171-173,
174-175
multiplication of, 425-432, 447, 460
order, 420-421
in partial correlation computation, 205-
206, 207
regular, 433
rows and columns of, 420
scalar, 421, 423
singular, 433, 445
square, 433, 438
symmetric, 423
transposition of, 424, 434
types of, 423
unity, 423, 432-434
variance-covariance, 215-219
vectors of, 421
zero, 423, 434
Maximum likelihood method of factor
analysis, 465


Mean(s): bias in, 328
as central tendency measure, 82, 99,
103 
characteristics, 103 
of chi square, 317 
combining statistics from different sam- 
ples, 120-121 
computation, 18, 101-103; from fre- 
quency distribution, 101-102, 113- 
119, 120 
correlated, 343, 346-348 
correlation coefficient as, 136-137 
deviations from, as variability indica- 
tors, 103-104 
differences between, 342-348; analysis 
of variance and, 359, 369-371; sig- 
nificance of, 343-348; standard de- 
viation and, 107 
formulas based on coding, 118-119 
normal curve, 276, 280 
Poisson distribution, 309-310 
standard error of, 331 
standard scores and, 107-111 
of sum of several variables, 121-122 
uncorrelated, 343-346 
visual checks, after computation from 
frequency distribution, 120 
Mean deviation, 104 
Mean squares, in analysis of variance, 
368, 371 
defined, 365 
Measurement: correlation coefficient and 
units of, 154-155 
standard error of, 401-402 
See also Point measures 
Median, defined, 82 
as measure of central tendency, 82, 
103 
normal curve, 276 
standard error of, 332 
Median test, 478 
Mesokurtic distributions, 293 
Midrank, 75 
Mode: categorical data application, 31, 
32-33, 34-35 
as central tendency measure, 33 
of chi square, 317 
of frequency distribution, 82 
normal curve, 276 
Modified range to measure variability,
86 
Multimodal distribution, 82 
Multiple correlation, 164-189 
criterion in, 165 
cross validation, 182-183, 185 
computation of multiple R, 170-175, 
179, 182, 183; checking, 170, 172, 
173, 179, 182, 183; from partial 
variance, 175, 503; with two pre- 
dictors, 169-170 
defined, 165 
determinants and, 442-443, 444-445 
nature of, 164-165, 185, 186 
partial correlation compared with, 210- 
211 
partial variance in relation to, 503
prediction and, 205, 262-266
regression coefficients, 164-166
residual variables of, 209-210
shrinkage of multiple, 184-185
suppressor variables, 184
test item selection by approximations
to, 409-410
variables in, 165-169; selection of,
177-184
variances in, 210, 211
Multiple cut-offs in prediction, 266
Multiple regression equation, 262-263,
265-266
Multiplication of matrices, 425-432, 447,
460


Nominal scales, 17 
basic operations with, 31, 32-42 
defined, 30 
establishment, 31-32 
ordinal scale compared with, 73-74, 
74-75 
in prediction, 241; differential effective- 
ness of prediction, 47; joint predic- 
tion from several, 43-47 
use, 33-35 
See also Categorical data 
Nonlinear relationships, 126-127 


Nonparametric statistical tests, 471-
481
See also Distribution-free statisti-
cal tests


Normal curve, 274-287 
applications of, 281-284, 303 
areas under, 278, 279-281, 484-485 
central limit theorem, 300-301 
chi square and, 318 
confidence limits and, 353-354 
described, 304 
development of table of, 277-279 
distribution inferences from, 284-
287
formulas, 275-276
kurtosis and, 292-294
mean of, 276, 280
normalizing a distribution, 297-300
ordinates of, 276, 277-279, 284, 490-
491, 492
probable error and, 287
properties of, 276-277
reading table of, 280-281
skewness and, 287-292
standard deviation, 107, 275n, 276-
277, 278, 287
Normalization of obtained distribution, 
297-300 


Normalized scores, 414 
Norms: defined, 14 
establishment, for tests, 410-414 
reference, 411-412 
standard scores, 413-414 
statistical, 411, 412-414 
Notation in analysis of variance, 380 
Null hypothesis, 53-54, 55, 334, 358 
analysis of variance and, 358, 359, 
362; simple analysis of variance in 
testing, 367-371 
described, 340 
differences between means and, 342 
significance level and, 53-54, 340-342 
Null matrix, 423 
Number of cases (N) 
in categorical data, 31, 32-33, 34-35 
correlation coefficient and, 153 
small, confidence limits and, 354 
Numbers: construction and use of table 
of random digits, 195-198 
square roots from 1 to 999, 498-501
squares from 0.1 to 99.9, 494-497 


Order-of-merit scales, 74 
Ordinal scales, 17 
frequency distribution and, 76-80 
nominal scale compared with, 73-75 
Ordinates of normal curve, 276, 277-279,
284, 490-491, 492


P values of ordinate of normal curve, 
492 
Parameters, estimation of, 328-329 
Part correlation: applications of,
209
computation, 206, 208
multiple, 210
Partial correlation, 201-208
comparison of partial r with r within
groups, 202-205
computation, 202, 205-208
defined, 201
determinants and, 443-444
higher-order, 205
multiple, 209-210
multiple correlation compared with,
210-211
value of, 205
variances in, 210-211
Partial standard deviation, 134
Partial variance, 134, 175, 503
Pearson, Karl, 25, 61, 125n, 289
Pearson's product-moment correlation co-
efficient, variants of, 221-229
Percent ranks, 75-76
Percentages, computation, 33, 34-
35
Percentile ranks, 76, 412-413, 414
calculation, 87-89
distribution of, 90
interpretation, 89-90
Percentiles, 412-413, 414
checking, 83-85
computation, 80-82, 82-85
defined, 76
interpretation, 89-90
point measures of variability, 85-87
Personnel psychologist, 24
Phi coefficient, 224-228
applications, 227-228
computation of, 225-226
in prediction, 245-246, 247-248
Pie chart preparation, 2-3
Pivotal condensation of determinants,
441
Platykurtic distributions, 293
Point biserial correlation, 221-222
applications of, 223-224
computation, 223
in validity coefficient computation, 
403 
Point measures: computation of, 80-82,
82-85, 112
of variability, 83
Poisson distribution, 309-312
Popularity, of test distractors, 407-408
of test responses, 406-407
Population, deductions about, 326-356 
sampling, 327-328 
Prediction, 16 
in bivariate distribution, 256-257 
with categorical data, 42-48 
chance and, 247-248 
correction for changes in range, 255-
262
correlation and, 136, 155-156, 205,
243, 245, 262-266
criteria in, 240-241; continuous crite-
rion from continuous predictor, 258-
259; dichotomized criterion, 249-
251
critical point, 254-255
cross validation of, 48
cut-off points, 254-255, 266
dichotomizing and, 245-246, 249-
251
differential effectiveness, 47
errors in, 135-136, 244, 249
expectancy chart, 242-243, 245
fourfold table for, 247-248
group and individual, 43, 243ff.
multiple correlation and, 205, 262-266
multiple cut-offs in, 266
from normal curve, 285-287
from a number of nominal scales, 43-
47
phi in, 245-246, 247-248
psychological, 239-242
scales in, 241
scores, regression equation in, 132-134
Taylor-Russell tables, 251-253
in terms of cost and utility, 253
variables in: continuous variable, 244-
245, 249; dichotomous variable,
245-246, 247-248; one variable
from another, 244-245, 249; team
of variables in, 262-266
See also Forecasting
Predictor variable, see Independent vari-
able
Predictors, see Variables
Principal axes solution in factor analysis,
464-465
Probable error and normal curve, 287
Probability: compound, 271-272
of departure from fixed value, 351-
352
formulas, 271-273; binomial expan-
sion, 273-274; normal, 275-276
hypothesis testing with chance distribu-
tion, 272-274
normal curve and, 274-284
null hypothesis and, 341
numerical values, 270
simple, 269-270
See also Chi square; Distributions;
Normal curve
Product-moment, defined, 125
Product-moment correlation, see Corre-
lation
Professional psychology, statistics in, 24-
Proportion: area under normal curve,
278, 279-291, 484-485
bias in, 328
categorical data application, 31
computation, 33, 34-35
Psychology, prediction in, 239-242


Quota sampling, 327


Random digits, table of, 195-198
Random error: defined, 388
in tests, 388-389; reliability and, 390
Random sampling, 327
Randomization of cases, 194-199
Randomized group design, 368-371
Range: modified, 86
semi-interquartile, 86-87
Range changes: correction for, 255-262
correlation coefficient and, 156, 257
test reliability and, 402
Rank correlation, 90-95
coefficient of, 472; as distribution-free
statistic, 472-474, 478
Kendall's tau, 90, 92-95, 478
Spearman's rho, 90-92, 94, 228-229
Rank order, see Ranks
Ranking, 73-98
in correlation, 90-95
described, 17-18
descriptive statistics based on, 75-76
distribution-free statistical tests and,
472-476
methods of, 74-75
Ranks: average, 75
computation, 75
percent, 75-76
percentile, 76, 87-90, 412-413, 414
Ratio scales, 99-101
Rational equivalence method for test re-
liability, 393-398
Ratios: residual variables as, 200-201
variance, in analysis of variance,
362
Raw scores: correlation coefficient com-
putation from, 138-140, 144
interpretation, 411
regression equation, 131-132, 263-264
Reference norms for scores, 411-412
Regressed scores, 210
Regression, defined, 126n
linear, with two variables, 125-163
Regression coefficients, in multiple corre-
lation, 164-166
types, 165
Regression equation, defined, 126
multiple, 262-263
raw-score, 131-132, 263-264 
residual variables in, 210 
two-variable, 130-132 
utility of, 265-266
variance division, 133-134, 135-136 
Regression line, 127-128 
homoscedasticity around, 134-136 
r as slope of line of best fit, 128-130 
with two variables, 125-163 
Regular matrix, 433 
Relationships: chi square in testing pres- 
ence of, 54-55 
contingency coefficient in measurement, 
35-42 
correlation and, 125-128, 155-156 
forecasting and, 16 
linear and nonlinear, 126-127 
measurement of, 15, 35-42; special
measures of, 215-238
two variables, formulas and, 23
between variables: prediction from,
240; preparation of charts
expressing, 7
Reliability of tests, 389-403
alternate forms in estimation, 390-391,
394
correction for attenuation, 400-401
correlation and changes in, 398-403
correlation between true and obtained
scores, 402-403
defined, 390
random error and, 390
range changes and, 402
rational equivalence method in
estimation, 393-398
Spearman-Brown "prophecy" formula,
392
"split-half" method in estimation,
391-392, 394
standard error of measurement and
consistency, 401-402
test length and, 393
test-retest method and, 390, 394
variable modification and, 400
Remainder sum of squares, 381-382, 383
Replication, defined, 361
Residual scores, 210
Residual variables, 166-169, 169-170,
171, 199-201
control through forming, 199
described, 166-167
multiple correlation of, 209-210
in part correlation, 209
partial correlation, 201-208
ratios as, 200-201
in regression equations, 210
variance of, 134
Restriction, correction for, 261-262
Rotation of axes in factor analysis,
458-460, 465
Run test, 474-475


Sample: area sampling, 327-328 
balancing, 194 
categories of, 30-32, 67 
quota, 327 
random, 327 
randomization of, 194-199 
requirements for, 328 
sampling methods, 327-328 
selection, as control, 192-193 
small frequencies, 67 
stratified, 327 
See also Data; Population 
Scalar, 421 
Scalar matrix, 423 
Scales: interval, 18, 99-101 
in prediction, 241 
ratio, 99-101 


Scatter diagrams, checking, 144, 146,
151
correlation coefficient computation
from, 141-152
expectancy chart and, 242-243 
preparation of, 141-142, 144, 146 
tallying in, 142 
Scores: coded, correlation coefficient 
computation from, 140-152 
correlation between true and obtained, 
402-403 
normalized, 414 
percentiles and percentile ranks,
412-413, 414
random errors in, 388-389 
ranking, 17 
raw: correlation coefficient computa- 
tion from, 138-140, 144; interpreta- 
tion, 411; regression equation, 263- 
264 
reference norms for, 411-412 
regressed, 210 
regression equation and, 131-132, 263- 
264 
standard, 107-111, 413-414 
statistical norms for, 411, 412-414 
T scores, 109-111, 413 
Scoring formulas for tests, 404-406 
Semi-interquartile range (Q),
computation, 86-87
defined, 86 
Shrinkage in multiple R, 265 
Sign test, 476-478 
Significance level: analysis of variance
in testing, 358, 365-366
of chi square, determination, 62-63
correlation coefficient, 333-335,
336-337, 493; testing, 338-339
of correlation ratio (eta), 234
differences between correlated means,
346-348
differences between correlations,
348-350
differences between uncorrelated means,
343-346
F ratio, 375
null hypothesis and, 53-54; testing,
340-342
testing, 338-339, 340-342, 351-352,
358, 365-366
Simple probability, 269-270 
Singular matrix, 433, 445 
Size of sample, 328 
Skewness, in chi square distributions,
316-317
formulas, 289-290
of frequency distributions, 287-292
reasons for, 287-288
testing, 290-292; binomial distribution,
306
Solution: of determinants, 440-441
of matrix equations, 436-438
Spearman, Charles, 25
Spearman-Brown "prophecy" formula for
reliability, 392
Spearman's correction for attenuation,
400-401
Spearman's rank correlation coefficient
(rho), 228-229
computation, 90-92, 94
as distribution-free statistic, 472-474
Split-half reliability estimation, 391-392,
394
Square matrix, as determinant, 438
inverse of, 433-434
Square root method of factor analysis,
463
Square roots of integers from 1 to 999,
498-501
Squares of numbers from 1 to 99.9,
494-497
See also Sum of squares
Standard deviation, bias in, 328-329
combining statistics from different
samples, 120-121
computation, 106-107; after coding,
113-119; from frequency distribution,
113-119; from raw scores, 101-102,
106-107
correlation coefficient computation
from, 152-153
defined, 99
of errors of prediction, 135-136
formulas based on coding, 118-119 
functions of, 107 
normal curve, 107, 275n, 276-277,
278, 287
partial, 134
properties of, 119
Sheppard’s correction, 118
standard scores and, 107-111
transformations, 107-111
visual checks, after computation from
frequency distribution, 120
Standard error, correlation coefficient,
332-333
described, 330-331
of estimate, 136, 249
examples of, 331-333
interpretation of, 332
mean, 331
measurement, 401-402
median, 332
Standard scores, 413-414
computation, 107-111
described, 413
norms, 414
variables and, 107-111
Stanine scale in distribution normalizing,
298-300
Statistical norms for scores, 411, 412-414
Statistical tests, see Tests, statistical
Statistics, bias of, 351
confidence limits for, 353-354
distribution of, 329-330
See also Descriptive statistics
Step interval, in frequency distribution,
78, 79-80
in scatter diagrams, 142, 146
Straight line: correlation coefficient as
slope of line of best fit, 128-130 
equation of, see Regression equation 
Stratified sampling, 327 
Subtractions of matrices, 425 
Sum formula for correlation coefficient, 
152-153 
Sum of Squares, in analysis of variance, 
360, 362-366 
calculation, 368-371 
defined, 360 
degrees of freedom and, 365, 366-367 
division into two components, 362- 
366 
relationships to correlation coefficient 
and correlation ratio, 372-375 
in two-way analysis of variance, 381- 
383 
Suppressor variables, 240, 265
negative betas and, 184
Symbols, of determinants, 438-439
glossary of, 509-513
matrices, 421-422
varieties of, 18-20
Symmetric matrix, 423


t distribution, 312, 319-321 
application to correlation coefficient, 
335-337 
T score, 413 
computation, 109-111 
t test, 366 
Tallying, 32, 34 
frequency distribution, 78, 80 
with nominal scale, 34 
in scatter diagrams, 142 
Taylor-Russell tables, 251-253 
Test-retest method for reliability, 390, 
394, 396 
Test scores, see Scores 
Tests, psychological: 
homogeneous and heterogeneous, 387- 
388 
internal consistency, 408-409 
items in, 387-388; item analysis, 406- 
410; item selection, 409-410 
norm establishment for, 410-414 
random error in, 388-389 
reliability, 389-403 
See also Reliability of tests 
scoring formulas, 404-406
validity, 387, 403-404, 409 
Tests, statistical: alternatives in, 341-342 
distribution-free, 471-481 
See also Tests of statistical signifi- 
cance; and under specific test 
Tests of statistical significance, 351-352 
analysis of variance in, 358, 365-366 
chi square, 62-63 
correlation coefficient, 338-339 
null hypothesis and, 340-342 
See also under specific test 
Tetrachoric correlation, 226-228, 230- 
231 
applications, 227-228 
computation, 226-227 
formula, 230-231 
Thorndike, E. L., 26 
Three variables, charts expressing rela- 
tionship between, 12 
Thurstone, L. L., 25, 26 
Transformations: linear, 107-111
of variables, 14 
Transposition of matrix, 424 
Triangular matrix, 423 
Two variables: linear correlation and
regression, 125-163
multiple R computation with, 169- 
170 
relationship between: charts express- 
ing, 6-10; curvilinear, 9-10; formu- 
las and, 23; frequency diagram of, 
7-9; graphing, 127-128 
Two-way frequency diagram, 7-9 
Type I error, 341-342 
Type II error, 342


U test, 476, 478 
Uncorrelated means, 343-346 
Unimodal distribution, 82 
Unit matrix, 423, 432-434 
Unpredicted variance, 134 


Validity: in multiple correlation, 185- 
186 
of regression equation, 265-266 
tests, 387, 403-404, 409
Variability: deviations from mean as in- 
dicators of, 103-104 
homoscedasticity around regression 
line, 134-136 
measures of, 14 
point measures of, 83 
range, 80, 86-87 
Variables, 1n
analysis of variance with single inde- 
pendent, 367-371 
in beta computation, 175-176 
curvilinear correlation, 232, 234- 
235 
control of extraneous, 191-199 
in correlation, 126, 156; multiple cor- 
relation, 165-169, 177-186 
dependent (criterion), 126, 190-191
dichotomous, 242, 243, 245-246, 247-248
elimination as control, 192-193
expectancy charts, 242-243
graphic representation, in factor
analysis, 458-462, 465
independent and dependent, 190-191
mean of sum of several, 121-122
modifying, test reliability and, 400
in prediction, 240, 245-246, 247-248,
249, 262-266
residual, 166-170, 171, 199-201
single common factor in factor
analysis, 453-457
standard scores and, 107-111
suppressor, 184, 240, 265
transformations of, 14
two, see Two variables 
variances and covariances of, 167-170, 
171, 217-219 
See also Residual variables 
Variance, 14 
additive properties of, 215-219 
Bartlett’s test of homogeneity of, 375- 
376 
in beta computation, 176 
bias in, 328-329 
of chi square, 317 
combining statistics from different 
samples, 120, 121 
of composite variables, 217-219 
computation, 102, 105, 114-117, 119, 
425-427; after coding, 114-117, 119; 
from frequency distribution, 114- 
117, 119; matrix multiplication in, 
425-427; from raw scores, 139-140; 
of residual variables, 167-170, 
171 
correlation and, 106; correlation coeffi- 
cient computation from, 152-153; 
in multiple and partial correlation, 
210-211; multiple R computation 
from, 175 
defined, 99, 104-105 
division, 133-134, 135-136 
formulas for, 104-106 
heterogeneous, 376 
interpretation, 106 
known criterion, correction on inde- 
pendent variable, 261 
known predictor, correction for restric- 
tion, 261-262 
matrices and, 425-427, 428-432 
partial, 134; multiple R and, 175, 503 
Poisson distribution, 310 
predictable, of criterion, 134, 135 
prediction accuracy and, 255-257
ratio: in analysis of variance, 362;
F distribution and, 365
residual variables, 167-169, 169-170,
171
residuals of, 134
sources, 14
test, item statistics and, 410
unpredicted, 134, 135 
of weighted composites, 220-221 
See also Analysis of variance 
Variates, see Variables 
Vectors of matrices, 421 


Wald-Wolfowitz run test, 474-475 
Wherry-Doolittle technique, 177-184, 265


Yates, F., 61
Yates’ correction for continuity, 67-69 


z transformation of correlation
coefficient, 337-338, 339-340, 504-507
z score, see Standard scores
Zero: requirement for ratio scales, 100
Zero matrix, 423, 434
Zero order, defined, 174